ABSTRACT


In  recent  years,  significant  advances  have  been  made  in  the  field 
of  science  education.  Yet  there  remains  considerable  uncertainty  as  to 
exactly  how  or  under  what  circumstances  the  ultimate  objectives  of  science 
teaching  are,  in  fact,  realized. 

The  study  to  be  reported  here  arose  from  the  need  for  valid 
measures  of  the  effectiveness  of  science  education.  In  general,  the  purpose
of  the  study  was  to  assess  the  validity  of  certain  tests  purporting  to
measure  the  extent  to  which  four  major  objectives  are  realized  in  science 
teaching. 

A  factor-analytic  technique  was  adopted  to  determine  the  statistical 
variables  which  could  best  account  for  differences  in  the  test  scores.  These 
variables  were  then  interpreted  to  find  the  extent  to  which  they  were  related 
to  the  objectives. 

The  four  objectives  chosen  for  the  investigation  were: 

1.  Knowledge  of  facts  and  principles. 

2.  Ability  in  problem-solving,  including  both  logical  reasoning  and
creative  thinking. 

3.  Understanding  science. 

4.  Scientific  attitudes,  the  most  important  of  which  were  believed  to
be  open-mindedness,  belief  in  cause  and  effect  relationships, 
intellectual  honesty,  carefulness  and  accuracy,  and  critical 
mindedness. 

Some  of  the  best  available  measures  of  these  objectives  and  two 
specially  constructed  tests  were  administered  to  a  representative  sample 
of  Science  X  students  from  the  public  high  schools  in  Edmonton.  Local
school  and  Departmental  examinations  were  also  included.

The  following  conclusions  were  drawn  from  the  investigation: 

1.  The  STEP  Science  test  was  found  to  measure  knowledge  and  verbal 
facility,  but  did  not  prove  to  be  an  adequate  test  of  reasoning. 

2.  In  the  school  and  Departmental  examinations  both  knowledge  and 
reasoning  made  significant  contributions  to  scores. 

3.  The  Test  of  Understanding  Science  was  found  to  measure  the  knowledge
implied  in  the  objective  concerned,  and  was  related  to  open-mindedness
and  belief  in  cause  and  effect  relationships.

4.  The  evidence  supported  the  claim  that  the  Rokeach  scale  measures 
open-mindedness.  However,  the  measures  of  curiosity  and  of 
belief  in  cause  and  effect  relationships  were  heavily  dependent  on 
other  variables. 

Interestingly,  the  test  of  simple  recall  proved  to  be  dependent 
on  reasoning  as  well  as  knowledge.  The  contribution  of  reasoning  was 
apparently  due  to  the  fact  that  one  quarter  of  the  items  involved  some 
application,  though  this  was  of  a  simple  kind. 

At  the  conclusion  of  this  study  one  important  question  remained 
unanswered:  What  kinds  of  items  best  measure  reasoning  ability  in 
science?  A  suggestion  of  an  investigation  which  could  throw  light  on  this 
problem  has  been  advanced. 

The  investigation  has  highlighted  the  need  for  further  studies  to 
develop  satisfactory  instruments  to  evaluate  scientific  attitudes.  However, 
the  evidence  suggests  that  tests  of  understanding  science  do  give  some 
assessment  of  these  attitudes  as  well  as  important  factual  knowledge.  It 
has  been  suggested  that  the  evaluation  of  understanding  science  should  be 
attempted  more  extensively  in  schools.


CHAPTER  I


THE  PROBLEM 


Introduction 

Science  educators  quite  frequently  make  the  claim  that  there  is 
a  gap  between  the  beliefs  held  regarding  the  objectives  of  science 
teaching  and  the  practice  of  those  engaged  in  the  process. 

The  review  of  science  education  in  the  most  recent  edition  of
the  Encyclopedia  of  Educational  Research  (1960)  pointed  out  that  there
is  a  deficiency  in  the  efforts  to  relate  objectives  to  classroom  practice.
Also,  Carleton  (1960),  writing  in  the  Fifty-Ninth  Yearbook  of  the
National  Society  for  the  Study  of  Education,  cited  studies  by  Beauchamp
and  Obourn  to  show  that  little  progress  has  been  made  in  the
classrooms  across  the  United  States  of  America  in  attaining  some  of
the  more  important  of  these  purposes.  It  was  in  an  attempt  to  alert
science  educators  to  this  problem  that  Anderson  (1950)  undertook
his  "frontal  attack"  on  the  evaluation  of  the  achievement  of  objectives
in  Minnesota  schools.  Whilst  there  do  not  seem  to  have  been
comparable  investigations  in  other  countries  to  provide  the  basis  for
such  conclusions,  one  gains  the  impression  that  similar  views  are
held  elsewhere  to  a  greater  or  lesser  extent.  The  authors  of  the
British  Ministry  of  Education  Pamphlet  Number  38  (1960)  emphasized
that  the  ideals,  methods,  and  attitudes  of  the  scientific  approach  only
become  part  of  a  pupil's  outlook  when  a  conscious  effort  is  made  to
communicate  them,  and  they  pointed  to  the  need  for  this  in  the
interests  of  a  liberal  education.

Reading  the  many  discussions  of  objectives  of  science  teaching
reveals  differences  in  the  statements  of  these.  However,  the  claims
of  those  who,  like  Carleton  (1960),  say  that  there  has  been  general
agreement  regarding  the  purposes  for  almost  forty  years,  seem  to
be  borne  out  upon  a  deeper  examination  of  statements  of  objectives.
Thus  the  discussions  of  aims  by  Burnett  (1957),  Hurd  (1960),  and  the
British  Ministry  of  Education  Inspectors  (1960),  for  example,  are  at
first  sight  quite  dissimilar.  On  further  study  one  sees  that  objectives
dealing  with  scientific  attitudes,  problem-solving  or  critical
thinking,  and  an  understanding  of  science  as  a  human  activity,  as 
well  as  a  knowledge  of  facts  and  principles  of  science,  are  held  in 
each  case,  and  are  held  to  be  among  the  most  important  objectives. 

It  is  sometimes  difficult  to  compare  statements  of  objectives 
because  these  are  formulated  at  different  levels.  There  are  those, 
like  Burnett  (1957),  who  draw  attention  to  the  need  to  spell  out  aims 
in  terms  of  considerable  detail,  and  related  to  desired  behaviors,  so 
that  the  statements  are  sufficiently  specific  to  make  for  ease  in  translation
to  classroom  practice.  There  are  others,  like  Dressel  (1960),
who  claim  that  the  proliferation  of  objectives  is  a  difficulty  and  that  the 
need  is  to  have  objectives  of  special  significance  clearly  identified  and 
emphasized  among  science  teachers.  The  authors  are  concerned  in 
these  statements  with  different  aspects  of  the  same  problem,  and  a 
cursory  examination  of  their  statements  could  easily  be  misleading. 

It  is  true  that  a  broad  objective  needs  to  be  broken  down  into  detailed 
behavioral  outcomes  as  a  guide  to  implementation.  However,  it  is 
necessary  that  the  teacher  be  convinced  of  the  importance  of  a  few 
broad  aims  which  he  can  readily  keep  in  mind,  and  which  can  thus
provide  a  focus  for  his  thinking  and  practice.  If  he  is  thus  convinced
he  will  find  increasing  need  for  detailed  analysis  of  these  objectives
in  behavioral  terms,  and  at  that  stage,  reference  to  such  detailed 
statements  can  be  of  great  assistance  to  him.  It  is  suggested  that  the 
objectives  which  have  been  noted  above,  as  common  to  the  various 
statements,  are  sufficiently  general  and  important  to  serve  as  basic 
guides  for  the  teacher.  It  might  be  noted  here  that  there  is  a  good 
deal  of  overlap  between  these  objectives  and  those  stated  by  the 
Alberta  Department  of  Education  in  the  Course  Outlines  for  Senior 
High  School  Science  (1961). 

A  reason  sometimes  put  forward,  for  example  by  Burnett 
(1957),  for  the  discrepancy  between  theory  and  practice,  is  that  science 
examinations  used  in  the  schools  often  test  mainly  factual  knowledge. 
Both  Dressel  (1960)  and  the  British  Ministry  of  Education  Inspectors
(1960)  made  the  point  that  this  tendency  to  concentrate  on  measuring
memorized  facts  results  from  the  relative  ease  of  construction  of
items  for  this  purpose  and  the  difficulty  of  testing  achievement  in  the
non-factual  areas.  Dressel  (1960)  pointed  to  the  importance  of  giving
attention  to  this  problem  when  he  said,  "since  one  of  the  axioms  of
measurement  is  that  objectives  not  tested  in  examinations  are  not  real
objectives  to  students,  it  behoves  every  teacher  to  include  items  in
examinations  which  measure  accomplishment  of  all  the  real  objectives
of  a  course."  (1960,  p.  59).

In  view  of  the  acknowledged  difficulty  in  testing  the  non-factual 
objectives,  it  might  be  contended  that  suitable  testing  instruments  for 
these  purposes  should  be  made  available  to  teachers.  However,  there
are  few  such  instruments  commercially  available,  and  with  regard  to
those  that  are  available,  and  the  kinds  of  items  suggested  in  textbooks
for  the  purposes,  it  has  not  been  clearly  demonstrated  that  they  actually
measure  what  they  purport  to  measure.  There  thus  emerges  the
need  for  valid  instruments  which  are  readily  available  to  teachers. 

The  need  for  valid  tests  has  also  been  stressed  in  regard  to 
research  studies  in  science.  The  reviewers  in  the  Encyclopedia  of
Educational  Research  (1960)  pointed  to  the  deficiencies  in  many  of  the
studies  in  science  education  due  to  invalid  and  unrealistic  criterion
measures  which  fail  to  sample  adequately  the  skills  or  understandings
under  study.  Watson  and  Cooley  (1960)  also  made  this  point.

Watson  and  Cooley  (1960)  also  raised  a  related  issue:  What
abilities  are  measured  by  various  kinds  of  items,  and  what  is  the
relationship  of  these  abilities  to  the  objectives  of  science  teaching?
In  pointing  out  this  need  in  the  area  of  evaluation  they  said,  "more
work  needs  to  be  done  on  the  intellectual  operations  required  by  various
items;  and  directions  should  be  provided  to  show  how  those  operations
are  related  to  the  desired  outcomes  of  science  instruction."  (1960,
p.  305).

It  was  the  general  purpose  of  the  investigation  to  be  reported 
here  to  assess  the  validity  of  certain  tests  purporting  to  measure  the 
extent  to  which  four  major  objectives  are  realized  in  science  teaching. 

The  Specific  Problem 

The  specific  problem  was  to  assess  the  validity  of  a  group  of 
tests  selected  as  measures  of  the  following  four  objectives:

1.  Knowledge  of  facts  and  principles.

2.  Ability  in  problem-solving.

3.  An  understanding  of  science  (for  example,  an  understanding
of  the  characteristics  of  scientists  as  people,  the  nature
of  scientific  investigation,  and  the  relationship  of  science  to
society).

4.  Scientific  attitudes. 

The  investigation  was  carried  out  with  grade  X  students  attending
classes  in  Science  X,  a  physical  science  course  designed  by  the
Department  of  Education  in  Alberta.  A  sample  of  students  was  chosen
to  be  representative  of  the  students  attending  this  course  in  the  public 
high  school  system  in  the  City  of  Edmonton. 

A  related  problem  was  to  assess  the  extent  to  which  Edmonton 
school  examinations  in  Science  X,  and  Departmental  examinations  in 
Science  IX  measure  the  chosen  objectives. 

A  subsidiary  problem  was  to  seek  evidence  of  a  special  scientific 
ability  factor  contributing  towards  performance  in  the  science  tests.
Such  an  ability  has  been  predicted  though  there  is,  as  yet,  little  evidence
for  its  existence.  This  problem  was  regarded  as  subsidiary  here  only
because  the  project  was  not  primarily  designed  for  the  purpose  of  such
identification.

An  Overview 

In  order  to  assess  the  validity  of  the  tests  it  was  necessary  to 
find  out  what  the  tests  were  measuring  and  to  compare  this  with  the 
objectives.  A  technique  which  was  well-suited  to  this  task  was  factor
analysis.  This  numerical  method  would  isolate  the  statistical  variables
which  best  account  for  the  differences  in  the  test  scores.  The
problem  would  then  consist  in  interpreting  the  factors  and  deciding 
the  extent  to  which  they  corresponded  with  the  objectives  under  study. 
This  was  the  approach  taken. 
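
The idea of isolating a few statistical variables that account for differences in test scores can be illustrated with a small sketch. The code below is not the procedure used in the study: it assumes synthetic scores on four hypothetical tests generated from two latent abilities, and it uses the principal-component method of extraction on the correlation matrix, which is only one of several possible factoring methods. The function name, test labels, sample size, and noise level are all assumptions made for illustration.

```python
import numpy as np


def principal_factor_loadings(scores, n_factors):
    """Return the test correlation matrix and unrotated loadings.

    scores: array of shape (n_students, n_tests).
    Loadings are eigenvectors of the correlation matrix scaled by the
    square roots of their eigenvalues (principal-component extraction).
    """
    corr = np.corrcoef(scores, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_factors]    # keep the largest ones
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    return corr, loadings


# Synthetic data (an assumption, not data from the study): two latent
# abilities, each measured by two noisy hypothetical tests.
rng = np.random.default_rng(0)
n_students = 500
verbal = rng.standard_normal(n_students)
reasoning = rng.standard_normal(n_students)


def noisy(ability):
    """Add test-specific error to a latent ability."""
    return ability + 0.5 * rng.standard_normal(n_students)


scores = np.column_stack([
    noisy(verbal),     # hypothetical vocabulary test
    noisy(verbal),     # hypothetical recall test
    noisy(reasoning),  # hypothetical deduction test
    noisy(reasoning),  # hypothetical induction test
])

corr, loadings = principal_factor_loadings(scores, n_factors=2)

# Communality: share of each test's variance explained by the two factors.
communalities = (loadings ** 2).sum(axis=1)

print(np.round(corr, 2))
print(np.round(loadings, 2))
print(np.round(communalities, 2))
```

Unrotated loadings of this kind are not directly interpretable as "verbal" or "reasoning"; in practice a rotation step would normally follow before the factors are compared with the objectives. The communalities, by contrast, do not depend on the rotation chosen.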

An  intensive  survey  of  the  literature  was  undertaken.  The 
survey  revealed  that  when  science  educators  speak  of  problem-solving 
they  imply  both  logical  reasoning  and  creative  activity.  The  evidence 
suggested  that  tests  of  the  reasoning  aspect  could  be  identified  by  their 
association  with  a  reasoning  factor  in  a  factor  analysis.  Tests  of 
recall  of  factual  knowledge,  on  the  other  hand,  should  be  related  mainly 
to  a  verbal  factor.  In  addition,  tests  measuring  the  affective  sets 
involved  in  attitudes  would  be  identified  by  factors  separate  from 
abilities.  Thus  a  basis  for  information  about  the  validity  of  tests  of 
three  of  the  objectives  was  laid.  There  was  little  evidence  about  the 
testing  of  understanding  science.  It  was  predicted  that  tests  of  this 
objective  would  be  related  to  verbal  and  reasoning  abilities  and  to 
attitudes  as  well.  However,  the  study  was  largely  exploratory  in  regard
to  tests  of  this  objective.

Suitable  reference  variables  to  identify  different  ability  factors
were  selected  as  well  as  some  of  the  best  available  measures  of  the 
four  objectives.  Scores  for  these  tests  were  gathered  by  administering 
them  to  the  sample.  Scores  for  local  school  and  Departmental  examinations
were  also  obtained.  The  assembled  data  was  then  analyzed.  The
analysis  yielded  verbal,  reasoning,  and  attitude  factors  as  predicted 
and  the  tests  were  evaluated  in  terms  of  their  relationship  to  these.
The  following  chapters  present,  in  detail,  the  review  of  the  literature,
the  resulting  theoretical  framework,  then  the  analysis  and  the  findings. 


CHAPTER  II 


A  REVIEW  OF  LITERATURE  RELATED  TO  THE 
INVESTIGATION 

Some  detail  has  been  given  in  Chapter  I  concerning  the  reasons 
for  the  choice  of  objectives  to  be  considered  in  this  investigation,  and 
the  references  to  the  literature  made  in  that  connection  will  not  be 
repeated  here.  The  present  chapter  will  be  concerned  with  a  review 
of  the  literature  covering  the  descriptions  of  these  objectives  and 
attempts  to  measure  them. 

Knowledge  as  an  Objective 

The  main  concern  regarding  knowledge  as  an  objective,  in  the 
present  context,  is  with  its  relation  to  the  other  objectives.  The 
question  of  what  knowledge  shall  form  the  content  for  science  courses 
is  one  of  the  most  important  to  occupy  the  minds  of  science  educators, 
as  is  evident,  for  example,  from  the  recent  developments  in  courses 
in  physics,  chemistry,  and  biology  in  the  United  States  of  America. 
However,  this  is  not  an  issue  which  is  under  investigation  in  the  present
study.  Rather,  it  is  realized  that  the  learning  of  skills,  understandings, 
and  attitudes  in  science  takes  place  in  the  context  of  a  certain  body  of 
knowledge,  and  is  dependent  on  concepts  learned.  In  measuring  skills 
and  understandings,  it  is  important  to  know  how  these  are  related  to 
the  processes  of  recall  which  are  involved  in  measuring  knowledge  gained.

In  discussing  levels  of  understanding,  Cronbach  (1954)  said  that
the  depth  of  understanding  depends,  in  part,  on  the  nature  of  the  material,
and  that  its  measurement  depends  on  the  kinds  of  items  used.  Thus,
items  of  terminology  often  do  not  involve  reasons  for  the  names  used, 
so  that  rote  memory  alone  is  involved  and  testing  is  by  simple  recall. 
On  the  other  hand,  the  understanding  of  a  principle  reflects  the  facility 
with  which  the  subject  can  apply  it  to  solve  unfamiliar  problems. 
Cronbach  pointed  out  that  examinations  may  only  pose  problems  of  a 
kind  which  the  subject  has  solved  many  times,  but  that  when  a  problem 
is  unfamiliar,  the  subject's  application  of  the  principle  in  its  solution 
shows  greater  depth  of  understanding.  In  the  former  case  only  recall 
may  be  involved,  in  the  latter  reasoning  as  well. 

There  is  a  suggestion  in  discussions  such  as  those  of  Cronbach, 
that  recall  of  knowledge  depends  on  memory  alone,  and  that  reasoning 
involves  more  than  this.  It  is  relevant  therefore,  to  survey  some  of 
the  research  concerning  memory. 

Hilgard  (1957)  cited  experiments  which  showed  that  memory 
for  number  sequences  was  better  when  a  principle  of  organization  was 
understood,  than  when  the  sequences  were  learned  by  rote.  Another 
experiment  indicated  that  memory  of  application  of  principles  in 
college  biology  was  better  than  memory  for  terminology;  the  effect 
was  most  pronounced  in  regard  to  long  term  memory. 

The  exceptional  case  of  the  "idiot  savant"  might  suggest  that 
there  is  a  special  memory  faculty.  However,  the  evidence  regarding 
the  relation  of  memory  to  understanding  would  suggest  that,  in  general, 
there  would  be  a  relation  between  memory  and  intelligence.  Stroud 
(1956)  has  summarized  some  of  the  evidence  for  such  a  relation. 
Positive  correlations  have  been  found  in  numerous  investigations,
though  the  values  are  not  high;  they  range  from  about  0.2  to  0.4.

Another  consideration  involving  the  knowledge  objective  and 
memory  stems  from  Thurstone  (1938)  who  identified  a  memory  factor 
in  his  factor  analytic  studies  and  included  it  among  his  "Primary 
Mental  Abilities."  The  tests  which  were  related  to  this  factor
involved  pair-associates  and  recognition  items.  Vernon  (1961)  has
reviewed  the  evidence  for  memory  factors.  In  doing  so,  he  pointed 
out  that  Thurstone's  tests  do  not  bear  much  resemblance  to  learning, 
retention,  or  recall  in  everyday  life.  He  cited  other  evidence  to 
support  the  existence  of  a  rote  memory  factor,  but  noted  that  in  all 
these  cases  short  term  memory  is  involved.  In  an  investigation 
where  delayed  memory  was  tested,  the  tests  did  not  load  on  the  rote 
memory  factor,  but  on  the  verbal  factor.  When  meaningful  memory 
was  involved,  the  g  and  verbal  factors  were  the  ones  most  concerned. 

It  is  reasonable  to  conclude  from  this  evidence  that,  in  a 
factorial  investigation,  a  test  of  knowledge,  involving  simple  recall 
type  items  would  have  high  loadings  on  a  verbal  factor  if  such  a  factor 
is  obtained  in  the  analysis.  If  a  general  intellectual  ability  factor  is 
obtained,  the  test  of  knowledge  probably  would  have  a  loading  on  this
factor  also.  However,  if  the  analysis  yields  a  reasoning  and  a  verbal 
factor,  it  is  likely  that  the  test  would  have  small  or  zero  loadings  on 
the  reasoning  factor. 
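
The prediction in the preceding paragraph can be restated with the usual linear factor model. What follows is only an illustrative sketch. The symbols are assumptions introduced here and are not notation taken from the study: z_K is the standardized score on a knowledge test, V and R are uncorrelated verbal and reasoning factors, the a terms are loadings, and u_K is the part of the score unique to the test.

```latex
z_K = a_{KV}\,V + a_{KR}\,R + u_K,
\qquad a_{KV} \text{ large}, \quad a_{KR} \approx 0,
\qquad r_{Kj} = a_{KV}\,a_{jV} + a_{KR}\,a_{jR}.
```

Under these assumptions the correlation r_{Kj} between a simple recall test and any other test j would arise mainly through the verbal loadings of the two tests, since the reasoning term contributes little when a_{KR} is near zero.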

Problem-solving  Ability 

It  used  to  be  common  for  science  educators  to  hold,  as  an 
objective,  that  students  should  be  trained  in  the  scientific  method.
Today,  the  development  of  problem-solving  ability  is  the  objective
which  has  generally  taken  the  place  of  the  former  statement.  To  a
large  extent  this  seems  to  have  come  about  as  a  result  of  controversy
concerning  the  scientific  method.  An  analysis  of  this  controversy  will
be  helpful  in  gaining  perspective  for  thinking  about  problem-solving 
ability  as  an  objective  of  science  teaching. 

Dewey  (1933)  regarded  the  scientific  method  as  an  important 
concept.  He  was  greatly  impressed  with  the  results  of  thought 
processes  in  the  sciences  and  wanted  to  see  the  same  standards  of 
objectivity  and  verification  applied  to  thinking  in  everyday  life.  For 
him  the  scientific  method  "includes,  in  short,  all  the  processes  by 
which  the  observing  and  amassing  of  data  are  regulated  with  a  view  to 
facilitating  the  formation  of  explanatory  conceptions  and  theories."
(1933,  p.  171).  The  processes  which  he  suggested  include  analysis,
collection,  comparison,  and  experimental  variation.  Dewey  probably 
did  not  intend  to  convey  the  impression  that  there  is  one  formula  to  be 
followed  in  every  practical  situation.  He  was  concerned  to  stress  the 
importance  of  a  systematic  approach  in  which,  "a  perplexing,  confused, 
unsettled  situation  is  transformed  into  one  that  is  coherent,  clear, 
and  decided  or  settled."  (1933,  p.  165).

Conant  took  what,  at  first  sight,  seems  to  be  an  extreme
position  when  he  claimed  that  "there  is  no  such  thing  as  the  scientific
method."  (1951,  p.  45).  He  argued  that  statements  of  the  scientific
method  are  an  oversimplification  and  that,  "a  careful  examination  of
physics,  chemistry  and  experimental  biology  fails  to  reveal  any  one
method  by  means  of  which  the  masters  in  these  fields  broke  new
ground."  (1951,  p.  45).

Keeslar  (1945)  attempted  an  empirical  approach  to  the  question. 
He  drew  up  a  list  of  elements  of  scientific  method  as  presented  in 
literature  on  the  subject,  and  used  these  in  a  questionnaire  sent  to  twenty-two
research  scientists  of  the  University  of  Michigan  staff.  They
rated  each  as  essential,  desirable,  or  unnecessary.  He  obtained  a
list  of  ten  elements  such  as  sensing  a  problem,  defining  the  problem,
selecting  the  most  likely  hypothesis,  testing  the  hypothesis  by  carrying
out  the  experiment  with  great  care  and  accuracy,  and  drawing  a
conclusion.  He  concluded,  "the  elements  of  the  scientific  method
are  definite,  distinct  from  scientific  attitude  and  known  and  used  by
scientists."  (1945,  p.  273).  Keeslar  was  thus  led  to  a  position
which  seemed  diametrically  opposed  to  that  of  Conant. 

Philosophers  of  science  have  also  discussed  this  question.
Nagel  (1958)  said  that  there  is  one  scientific  method  in  the  sense  of,
"the  way  in  which  statements  in  the  empirical  sciences  are  evaluated
in  the  light  of  the  available  evidence  for  them."  (1958,  p.  148).
Popper  (1959),  on  the  other  hand,  emphasized  the  role  of  creative
intuition  in  scientific  discovery,  and  claimed  that,  "there  is  no  such
thing  as  a  logical  method  of  having  new  ideas,  or  a  logical  reconstruction
of  this  process."  (1959,  p.  32).  Bridgman  (1950)  presented
his  point  of  view  as  follows:  "I  like  to  say  that  there  is  no  scientific
method  as  such,  but  rather  only  the  free  and  utmost  use  of  intelligence."
(1950,  p.  278).  He  claimed  that  any  apparently  unique  characteristics
of  method  are  to  be  explained  by  the  nature  of  the  subject  matter,
rather  than  by  the  method  itself.

An  examination  of  the  arguments  indicates  that  the  differences
are  not  so  great  as  may  seem  apparent  at  first  sight.  Some  of  the
authors  stressed  both  the  significance  of  the  analytical  procedures
which  have  been  going  on  before  the  problem  emerges  clearly  and
the  processes  of  verifying  hypotheses  once  formulated.  Others  were
concerned  to  point  out  that  there  is  more  to  scientific  discovery  than
logical  analysis  and  empirical  testing.  Hebb  (1958)  drew  a  clear
distinction  between  "discovery,"  as  "the  attaining  of  new  ideas,"  and
"verification,"  as  "the  process  of  testing,  clarifying  and  systematizing
them."  (1958,  p.  213).  This  distinction  between  the  creative  aspects
and  verification  seems  to  resolve  the  conflict  in  large  measure.  It  is
true  that  some  famous  scientists  and  mathematicians  have  gained  the
solution  to  a  problem  in  a  flash  of  insight,  but  this  has  generally
come  about  after  the  investigator  has  immersed  himself  thoroughly  in
the  problem,  pursuing  logical  thought  to  considerable  lengths.  It  is
true  too,  that  a  brilliant  hypothesis  has  been  formed  sometimes  on  the
basis  of  a  single  example  of  a  phenomenon,  but  this  has  been  followed
up  by  inductive  procedures  where  investigation  of  many  cases  leads  to
generalization  in  order  to  verify  the  hypothesis.  Thus,  logic  and
creative  activity  both  play  their  part.

It  would  seem  that  when  the  development  of  problem-solving
ability  is  stated  as  an  objective  of  science  teaching,  it  is  intended  that
both  creative  activity  and  logical  analysis  are  involved.  The  emphasis
placed  on  techniques  of  teaching  which  give  the  student  opportunities  to
solve  "genuine  problems,"  as  discussed  by  Mills  and  Dean  (1960),  for
example,  would  support  this  contention.  Brandwein  et  al.  (1958)
distinguished  between  "problem-solving,"  in  which  the  very  statement
of  the  problem  is  a  creative  act,  and  "problem-doing,"  in  which  a
stated  problem  is  the  starting  point.  They  believed  that  both  "problem-solving"
and  "problem-doing"  have  a  place  in  science  teaching;  presumably
the  latter  helps  to  develop  skill  in  the  former.

Sometimes  psychologists  have  found  it  helpful  to  consider 
problem-solving  apart  from  creative  thinking.  For  example,  Vinacke
(1952)  dealt  with  problem-solving  and  imaginative  thinking  separately, 
and  considered  creative  thinking  to  be  intermediate  between  these  two. 
He  suggested  that  imaginative  thinking  occurs  primarily  in  relation  to 
inner  needs,  whereas  problem-solving  occurs  primarily  in  relation  to 
the  external  world.  For  him  creative  thought,  "seems  to  be  intermediate
between  problem-solving  and  imagination,  occurring  in  special
situations  involving  nearly  indistinguishably  problem-solving  behavior
and  imagination."  (1952,  p.  160).  In  problem-solving  Vinacke  confined
attention  to  the  behavior  which  results  when  a  problem  has  been  recognized
and  the  subject  proceeds  towards  the  solution.  In  a  similar  way
Bartlett  (1958)  spoke  of  "thinking  within  closed  systems"  and  "adventurous
thinking"  (1958,  p.  75).  Others  such  as  Osgood  (1953)  have  not
made  such  a  distinction.  In  any  case,  it  will  be  helpful  to  consider
some  of  the  evidence  presented  by  psychologists  relating  to  the  processes
involved  in  problem-solving.

Wertheimer  has  been  quite  an  influential  figure  in  this  regard. 
Though  he  worked  in  the  earlier  part  of  this  century,  his  book, 
Productive  Thinking,  was  not  published  until  1945,  after  his  death.
Some  of  his  studies,  such  as  those  concerning  children's  thinking  in
relation  to  the  parallelogram  problem,  are  well  known.  He  laid  stress
upon  an  understanding  of  the  structural  and  functional  relationships  of
the  situation  in  order  that  insight  might  take  place.  His  Gestalt  position
is  also  evident  in  his  emphasis  upon  an  orientation  to  the  whole
problem,  rather  than  undue  emphasis  upon  the  parts,  so  that  the  subject
could  profitably  fill  in  the  gaps.  He  regarded  problem-solving  as
a  dynamic,  fluid  process,  with  the  subject  "centering"  his  thinking  on
the  essential  aspect  of  the  problem  (1959,  p.  181),  and  then  "recentering"
it  as  he  progressed  towards  solution.  Another  notable  figure  was
Duncker  (in  Chaplin  and  Krawiec,  1960),  whose  investigations  led  him
to  emphasize  the  importance  of  reformulation  of  the  problem  (similar
to  Wertheimer's  recentering),  functional  solutions,  and  finally  specific
solutions.  The  experiments  of  Luchins,  as  cited  by  Hilgard  (1957),
seem  to  be  relevant  to  the  work  of  Wertheimer  and  of  Duncker.  Luchins
showed  that  inefficiency  in  problem-solving  could  result  from  the
persistence  of  a  set  that  is  no  longer  profitable  in  a  particular  situation.
This  may  be  described  as  rigidity  in  thinking  as  opposed  to  the
flexibility  evident  in  "recentering"  and  "reformulating."  It  was
possible  that  in  an  analysis  involving  problem-solving  and  attitude 
variables,  such  as  the  present  one,  this  rigidity-flexibility  dimension 
might  appear.  In  fact,  it  did  not  appear  in  the  study. 

The  work  of  Wertheimer  and  Duncker  suggested  the  usefulness
of  a  kind  of  case  study  approach  to  problem-solving.  In  a  more
developed  form  this  involves  the  subject  "thinking  aloud"  as  he  works 
towards  a  solution,  introspection  by  the  subject,  and  observation  by 
the  investigator.  Bloom  and  Broder  (1950)  used  this  method,  largely
with  a  view  to  providing  a  basis  for  remedial  work  with  college  students.
One  of  their  findings  was  that  the  student's  attitude  toward  the
solution  of  problems  was  important;  he  needed  to  have  a  desirable 
attitude  toward  reasoning  processes,  and  to  have  confidence  in  his 
ability  to  solve  the  problems.  These  findings  too,  had  relevance  to 
the  present  study. 

A  factor-analytic  study  of  problem-solving  abilities  carried
out  by  Guilford  has  been  summarized  by  Hunt  (1961).  It  was  concluded
that  a  number  of  different  phases  in  the  problem-solving  process  can
be  identified;  these  were:  "the  analysis  of  the  difficulty,"  "production,"
"verification,"  and  "reapplication."

Bruner's (1956) treatment of "concept attainment" has much in common
with the work here reviewed on problem-solving, and indeed, he regarded
many aspects of "concept attainment" as problem-solving behavior.
Whilst the science teacher has as his objective the development of
skill in problem-solving, the context in which this takes place in the
classroom is largely that of "concept attainment."

Vinacke  (1952)  reported  evidence  that,  though  the  correlation 
of  problem-solving  ability  with  intelligence  test  scores  was  positive, 
it  was  lower  than  would  be  expected.  He  raised  questions  as  to 
whether  this  was  due  to  attitude  factors  or  whether  it  was  due  to  the 
content  of  the  intelligence  tests.  It  would  be  understandable,  in  part 
at  least,  if  it  were  assumed  that  the  overlap  was  due  to  what  would  be 
called  the  reasoning  aspects  of  the  intelligence  tests  and  the  difference 
largely  due  to  other  abilities. 

There has also been a good deal of effort expended in attempting to
clarify the nature of reasoning. Thurstone's "Primary Mental Abilities"
(1938) included Inductive and Deductive Reasoning factors. Green,
Guilford, and others (1953) carried out a factor analysis of 32
different tests with air force personnel, and identified seven
reasoning factors. One of these seemed to be the same as Thurstone's
deductive reasoning factor; three of them appeared to represent a
breakdown of the inductive factor, whilst another was called a general
reasoning factor. Vernon (1961) reported a large-scale investigation by
Lyerly and Adkins to elucidate the nature of reasoning, but only five
out of the 16 factors extracted were related to reasoning. There was
some common ground with the investigation of Green et al., but Vernon
concluded that the study confused the issue rather than clarifying it.
Vernon suggested that when the g factor is extracted, additional
reasoning group factors are small and this simplifies the picture. It
is evident that the practical problem here concerns the nature of the
group with which one is working. If the group consists of high ability
personnel, such as those in the studies of Guilford, then individual
differences could be determined by fine distinctions within the
reasoning domain. If, however, the group is heterogeneous, then more
gross structure would determine the differences. Rust's investigations
(1960, 1962) of critical thinking tests have indicated that, in the
instruments used, the subtests designed to measure distinct abilities
within the reasoning domain do not provide a reliable means of
measurement, if such distinct abilities exist.

There has been an increasing interest in creative thinking, including
studies with gifted children such as the one recently reported by
Getzels and Jackson (1962). It has been suggested in the present review
that science educators have creativity in mind when they speak of
problem-solving ability. Getzels and Jackson's report indicated
positive correlations between intelligence test scores and creativity
as measured by their tests, but the correlations were not high. They
also reported positive correlations between school achievement and
tests of creativity, but again the values were not high. They suggested
differences in attitudes and values as contributing to the lowness of
these correlations as compared with what one might expect. They also
pointed to differences in the tasks involved in the tests. The latter
difference corresponds partly to Guilford's distinction between
"convergent" and "divergent" thinking factors (1959, p. 360).
"Convergent" thinking is involved when the subject proceeds towards one
correct answer, whereas "divergent" thinking is defined as the kind
which goes off in different directions. Guilford listed some 11 factors
in the "divergent" thinking area, and expected that more would be
discovered. It should be noted again that Guilford has been working
with highly selected subjects, and his diversity of factors may not be
especially helpful in the ordinary classroom. Recent research with
English grammar school pupils by Sultan (1962) tended to bear this out.
He administered a large battery of tests, many based on those of
Guilford, and found that g, verbal, and spatial factors accounted for
much of the variance. He did find that fluency and originality factors
were present.

The Sequential Tests of Educational Progress (STEP) Science Tests
(1957) seem to have been designed as measures of problem-solving
ability in science, especially when this is thought of as the behavior
involved in reasoning to the solution of a problem. The manual says,
"the STEP Science tests were designed to measure ability to use
scientific knowledge to solve problems," and lists as "the most
important types of scientific reasoning" the following: "Ability to
identify and define scientific problems," "Ability to suggest or screen
hypotheses," "Ability to select valid procedures," "Ability to
interpret data and draw conclusions," "Ability to evaluate critically
claims or statements made by others," and "Ability to reason
quantitatively and symbolically" (1957, p. 7). Since the beginning of
the present investigation, a study of the significance of the format of
the STEP Science tests in relation to test scores has been published by
Gega and Karlsen (1962). They suggested that the unique thing about
these tests was the format, in which various environmental situations
are put forward and each one is followed by a series of
application-type items pertinent to the situation. Working with fifth
and sixth grade children, they compared the standard test with a
version of it in which the items were made self-contained and the
situational format was eliminated. They found no significant
differences in the means and standard deviations of the two sets of
scores. They suggested, as a result of other findings, that reading
ability played a considerable part in performance on the STEP test at
the grade level at which the investigation was conducted. It is
interesting to note, however, that the content of the experimental form
of the test was only three-quarters as long (in terms of words used) as
the standard form, and yet there was no significant difference in the
means and standard deviations. It is true that the format is unique.
However, the present writer was also impressed with the way in which
the items require the subject to apply his knowledge in new situations.

A test which has been praised by Dressel in the Fifty-Ninth Yearbook
of the N.S.S.E. (1960) is Burmester's Test of Aspects of Scientific
Thinking. This test did not presuppose mastery of science content or
technical language. Scores obtained on the test by students entering
Michigan State University were found to correlate 0.64 with grades in
the Freshman course in natural science. The test was designed for use
with late high school and first year college students, and for this
reason did not seem suitable for the present investigation.

Nelson (1958) has offered many suggestions of kinds of items which
might be useful as measures of problem-solving. He has organized these
around themes such as recognizing and appraising assumptions,
evaluating hypotheses, and analyzing experiments.

An  Understanding  of  Science 

An understanding of science is an objective which has become
prominent in the last decade or so, but there is little mention in the
literature of its measurement as such. Conant (1959) has stressed its
importance in the following words: "we need a widespread understanding
of science in this country, for only thus can science be assimilated
into our secular cultural pattern" (1959, p. 19). Included in his many
helpful suggestions about the important components of this
understanding are the role of experiment, the relationship of science
and technology, the dynamic quality of scientific knowledge, the
attitude of the scientist to his work, and the relationship of science
to society.

The Science Manpower Project Fellows at Columbia University have
pointed out that the number of students entering scientific professions
is related to their attitudes towards science, and that "constructive
attitudes toward science requires a comprehensive understanding of the
nature of science and of scientific work" (1959, pp. 38-39). In 1957
the Project sponsored an investigation of attitude toward science and
scientists held by senior high school students in New Jersey. The test
measured understanding of science as a determiner of attitudes. It was
concluded that the students did possess attitudes favorable to science
and the scientific endeavor. Concern was expressed, nevertheless, about
the considerable amount of misunderstanding and ignorance of the nature
of science that was revealed. Their finding regarding the attitude to
science and the scientific endeavor appears to disagree with the
findings of Mead and Metraux (1957). The latter investigators concluded
that the majority of high school students in their study had an
unfavorable image of the scientist. Brandwein (1959) presented further
evidence which tends to support that of the Science Manpower Project.

An experimental test called The Facts About Science Test (1958) was
published by the Educational Testing Service for measurement of an
understanding of science. Very little statistical data were available
at the time, and such data have not been forthcoming. More recently,
Cooley and Klopfer of the Harvard Graduate School of Education have
worked in cooperation with the Educational Testing Service to produce
the "Test of Understanding Science" (1961). According to the manual
(1961) three areas are covered: "Understandings about the scientific
enterprise," "Understandings about scientists," and "Understandings
about the methods and aims of science" (1961, p. 3). The test seems to
be the result of careful production, but there is no independent
evidence relating to it at the present time. Its authors are using it
in a five-year investigation in which they are attempting to identify
measurable predictors for entrance into scientific careers. It has been
used in the present investigation, and technical details about it are
included in Chapter IV.

Scientific  Attitudes 

Probably the earliest significant work in defining and measuring
scientific attitude was that of Curtis in the 1920's. The original
references for the work of Curtis were not available to the
investigator, but Curtis' concept of scientific attitude was described
by Hoff (1936). Curtis isolated five main components: conviction of
universal basic cause and effect relationships, sensitive curiosity,
habit of delayed response, habit of weighing evidence, and
open-mindedness. His test was designed to measure scientific attitude
in terms of these components.

In the next decade Noll devoted a good deal of attention to this
question, and his test was published by the Bureau of Publications,
Teachers' College, Columbia University (1934-1935). Noll based his test
on the following elements of the scientific attitude: accuracy,
intellectual honesty, open-mindedness, suspended judgment, looking for
true cause and effect relationships, and criticalness. Recently Kahn
(1962), after surveying the literature, concluded that the two tests
just mentioned, those of Curtis and Noll, were the most satisfactory of
those previously constructed, and used these in an experimental study
of the use of current events in science to teach scientific attitudes
to a group of seventh and eighth grade boys. Blair (1940) criticized
the validity of Noll's test. He obtained the responses of sixteen
scientists in various branches of science on the Faculty of the
University of Illinois. He found wide discrepancies with the responses
suggested by the key, and decided that twenty-six items in Form I and
twenty-five items in Form II (from a total of 112 items in each case)
were invalid, because fewer than three-quarters of the scientists were
in agreement on the appropriate responses.

It seems to the writer that the educator is concerned about the
transfer of the scientific attitudes to behavior in everyday life. In
that case scientists are not in a preferred position in forming a
validation group, and such a group of scientists can be expected to
show a variation in the opinions which are held by its members.
However, though such a distribution of opinion of judges is taken into
account in constructing attitude scales by some methods, it did not
find a satisfactory place in Noll's test, where the items were scored
true or false without any determination of scale values.
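
Blair's screening rule can be made concrete with a short sketch. The
Python below is an illustration only, not a reconstruction of Blair's
procedure: it assumes that "agreement" means agreement with the keyed
answer, and the function names, the small response matrix, and the key
are all hypothetical.

```python
def agreement_with_key(responses, key):
    """Proportion of judges whose answer matches the key, item by item."""
    n_judges = len(responses)
    return [
        sum(judge[i] == key[i] for judge in responses) / n_judges
        for i in range(len(key))
    ]


def flag_items(responses, key, threshold=0.75):
    """Indices of items on which fewer than `threshold` of judges match the key."""
    proportions = agreement_with_key(responses, key)
    return [i for i, p in enumerate(proportions) if p < threshold]


# Hypothetical data: four judges answering three true/false items.
key = [True, False, True]
judges = [
    [True, False, True],
    [True, False, False],
    [True, True, False],
    [True, False, True],
]

flagged = flag_items(judges, key)
print(flagged)
```

With the hypothetical data above, only the third item falls below the
three-quarters threshold; an item on which exactly three-quarters of
the judges match the key is retained under this reading of the rule.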

Crowell (1937) carried out an extensive survey and drew up a list of
twenty-nine attitudes which seemed important in science, and had these
evaluated by sixty-four judges. The most important ones were
open-mindedness, carefulness and accuracy, impartiality, belief in
cause and effect relationships, and suspended judgment.

Heiss (1958) identified the elements of scientific attitude which
seemed to be common to most discussions of the subject as: "curiosity;
freedom from bias, prejudice and superstitions; open-mindedness;
critical mindedness; intellectual honesty; belief in cause and effect;
and willingness to change beliefs when new evidence is found" (1958,
p. 371).

An assessment of scientific attitude was included by Anderson (1950)
in his evaluation of science teaching in Minnesota schools. Though his
study was regarded as an outstanding one by the reviewers in the
Encyclopedia of Educational Research (1960), no details of his tests
were given either in the review or in his published report (1950). It
is possible that the praise for his work was prompted by the
thoroughness of his statistical treatment, and by the fact that his
evaluation did relate to several objectives, more than by an analysis
of the instruments he used.

In examining the information which is available about the tests which
have been produced in this area, one notices that often in the one test
an attempt is made to measure a number of different attitudes and to
regard the score on the test as a measure of the scientific attitude of
the individual. Skewes (1933) suggested that it would be better to use
several different tests, and give scores for different scientific
attitudes. This seems to be a sound suggestion. Tests like that of Noll
purport to measure the scientific attitude, but assume that this is
made up of several components. There is no indication of a method of
weighting the scores in the elements so as to give the total score, and
it is not easy to see how this could be done. A more defensible
position would be to measure the component attitudes separately, using
one of the recognized methods of attitude scale construction. The
latter are based on a unidimensional model, that is, one in which one
attitude is being measured. Alternatively, it may be possible to build
up one scale using factor analysis to identify those statements which
contribute predominantly to one general factor in such a way that this
general factor corresponds with what is meant by scientific attitude.
In this investigation separate measures for the different component
attitudes were used. The study did not give evidence that a general
factor exists which could be called the scientific attitude.

Attempts to measure individually the attitudes which have been
identified as components of scientific attitude, or which might be
called scientific attitudes, do not seem to have been very numerous.
One thorough investigation of open-mindedness has been carried out by
Rokeach (1960). Beginning with an analysis of ideological dogmatism,
Rokeach and coworkers carried out extensive research into the nature of
belief systems. The basic characteristic which defines the extent to
which a person's belief system is open or closed is,

the extent to which the person can receive, evaluate, and act
on relevant information received from the outside on its own
intrinsic merits, unencumbered by irrelevant factors in the
situation arising from within the person or from the outside.
(1960, p. 57).

In the course of the research, a scale was constructed which attempted
to measure this openness or closedness of a person's belief systems,
and this dogmatism scale was used in the present investigation.

The 1940 Mental Measurements Yearbook contained a review of a test to
measure understanding of cause and effect relationships in science. It
was produced by the State Science Committee of the Wisconsin Education
Association. The reviewer was quite critical of the test on the grounds
that considerable differences of opinion were possible about what
judgments from students should be regarded as acceptable.

Some attention has been paid to the prevalence of superstitious
beliefs, especially amongst secondary school students, and the relation
of such beliefs to variables like age, intelligence, and emotional
adjustment. Measurement has generally been attempted by preparing lists
of superstitious beliefs and having the subjects respond to these in
some way which will indicate the influence of the beliefs upon them.

Maller and Lundeen (1933) investigated the sources of common
superstitious beliefs among students with a scale of 50 statements. The
same authors (1934) later investigated the relationship of such beliefs
to emotional maladjustment, as measured by an objective test, among
seventh grade students, and found a correlation of 0.55. In a similar
investigation Zapf (1945, a) found a correlation of 0.11. She reported
a correlation of -0.20 with intelligence, and found that girls were
more superstitious than boys, though no coefficient was given. Zapf
(1939) had earlier reported finding that the mere teaching of science
had no effect on students' beliefs in superstitions unless instruction
dealt with the specific ideas. The same author (1945, b) attempted to
validate the test used by comparing the verbal expressions of belief in
certain superstitions with behavior in laboratory situations involving
the practice of these beliefs. A correlation of 0.79 was obtained. The
particular laboratory tests seemed rather artificial, however, and it
is not at all certain that the laboratory behavior was a true
indication of what the subject would do in ordinary life.

Though curiosity has been mentioned often by educators in the
discussions of scientific attitudes, and exploratory behavior has been
the subject of much study with animals, little empirical investigation
of curiosity has been reported in the case of human beings. Berlyne
(1960) has put forward a behavioral theory of curiosity. He
distinguished between "epistemic curiosity," which he described as a
drive reducible by knowledge-rehearsal, and "perceptual curiosity,"
which is the kind of curiosity found in the lower animals, that leads
to increased perception of stimuli. An exploratory experiment designed
to test some of the predictions from the theory was reported by
Berlyne. Maw and Maw (1961), noting the dearth of instruments for
measuring human curiosity, began a study which was intended to produce
an instrument for use with elementary school children. The reports so
far only cover the setting up of high and low groups on curiosity for
validation purposes. For the groups set up on the basis of teacher and
peer judgments, there was a definite correlation with intelligence. The
authors made a further selection so as to obtain high and low groups
independent of intelligence. It is interesting to note that the
correlations of intelligence with teacher judgments of curiosity were
generally higher than the corresponding correlations with peer
judgments. One of the various possible explanations for this is that
teachers tend to associate curiosity with intelligence, and their
ratings are biased accordingly.

One of the problems associated with attitude measurement is ensuring
that the attitude expressed is in agreement with the attitudes
evidenced by the subject in action. Because of this there has been
interest in indirect methods of measurement, many of which disguise the
purpose of the test. Numbered amongst these instruments are projective
techniques, information tests, estimation of group opinion, and tests
employing bias in perception and memory. In an extensive review of such
techniques, Campbell (1950) concluded that the evidence did not lend
support to claims that these tests have higher validity than direct
tests.

Factor  Analysis  Studies  Relating  to  Ability  and  Achievement  in  Science 

Several factor analytic studies conducted in the field of science
education in England have been surveyed by Peel (1955). Most of these
were concerned with the problem of the existence of a group factor for
ability in science. Those of Berridge, Jog, Khan, and Angus will be
mentioned here. Berridge used a battery of verbal, reasoning, memory,
spatial, and practical tests and included school marks in the heat,
mechanics, and hydrostatics portions of a school physics course. He
found that the "unrotated g factor" gave highest loadings in
hydrostatics and mechanics and lowest loadings in a mechanical test. By
rotation of the original centroid factors, he identified factors which
he called abstract reasoning, memory, use of words, number, induction,
and spatial reasoning. Induction and reasoning (spatial, abstract, and
verbal) correlated most highly with school physics. The science tests
in this battery were restricted in nature and were a small part of the
total number of tests.

Jog also confined his attention to one field in science. He analyzed
scores from verbal, non-verbal, spatial, and practical tests of
ability, together with attainment tests in arithmetic, algebra, and
physics, and tests of persistence, for grammar school students. He
identified only three significant factors affecting attainment in
physics: general intelligence, visuo-mechanical ability, and industry
(persistence).

The analysis by Khan is of interest because some of the tests he used
are similar to tests suggested as measures of critical thinking and
problem-solving skills. He included tests of accuracy of observation,
definition, classification, interpretation, application,
generalization, planning of experiments, and resourcefulness in
science. Working at the secondary school level, he isolated three
factors, called general intellective power, verbal reasoning, and
visual imagery.

From  a  mixed  battery  of  attainment,  ability,  and  interest  tests 
in  science,  Angus  obtained  a  clear  group  factor  for  interest. 

Evidence for a scientific group factor in school attainment was
contained in the report of a study by Lewis (1961). Lewis analyzed
achievement scores in eight school subjects (English, Latin, French,
History, Geography, Arithmetic, Algebra, and Geometry) along with
specially constructed tests in Physics, Chemistry, and Biology for 173
boys in Belfast grammar schools. He constructed the tests in Physics,
Chemistry, and Biology because science had been taught as a combined
subject. He found a general intellectual ability factor to be the most
significant one, but also found a group factor determined by the tests
in the three sciences.

In the United States Wolins, Mackinney, and Stephans (1961)
administered a group of standardized achievement tests (the Cooperative
and California Achievement Tests in Science, Nelson Biology, Anderson
Chemistry, Cooperative English, and Cooperative American History) to
students in Chicago. The results were correlated with school grades and
with sex. The study was carried out in three sections since students
were enrolled in Physics, Chemistry, or Biology. The numbers of
students were not large (Physics 87, Chemistry 75, and Biology 119).
Three significant factors were isolated. These were called "general
intellectual ability," "male interest-achievement," and "specific
achievement" (that is, specific to the different sciences). The "male
interest-achievement" factor had high loadings from sex and interest,
but the loadings of the achievement tests were all quite low except in
one case. The authors suggested that the Kuder Scientific Scale was
more a measure of masculinity-femininity than of science interest. Two
points might be noted in this regard. Firstly, the work of Meyer and
Penfold (1961) and others has shown that there is a positive
correlation between male sex and science interest; the coefficient
obtained by Meyer and Penfold was 0.61. Secondly, Guilford (1959) has
claimed that the Kuder Preference Record scores should not be
correlated with scores such as those usually obtained from aptitude or
attainment tests. The problem here is that, whereas in the usual tests
of ability and achievement success in each item does not depend on the
answer given to any other item, in the forced-choice interest test the
items are so interconnected as to make it impossible for the subject to
be rated equally highly on all of the interests measured. Such scores
are called ipsative scores. It could be, then, that an interest measure
giving normative scores would have established this factor as clearly a
sex factor or an interest factor. The third factor suggests that
distinct abilities in the different "sciences" may be important over
and above general intellectual ability. However, the loadings of the
standardized tests in biology and chemistry were low on this factor,
and it may be that the factor is determined by something present in the
school situations, such as attitudes, which is not reflected in the
standardized tests.
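
The constraint that makes ipsative scores awkward to correlate can be
shown numerically. The sketch below is an illustration under simple
assumptions, not the Kuder scoring procedure: it uses simulated,
hypothetical scale scores and treats ipsative scoring as expressing
each person's scores as deviations from that person's own mean.

```python
import random
from statistics import mean


def ipsatize(profile):
    """Express one person's scale scores as deviations from their own mean."""
    m = mean(profile)
    return [x - m for x in profile]


def covariance(xs, ys):
    """Sample covariance of two equally long score lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


random.seed(0)
N_PEOPLE, N_SCALES = 200, 4  # hypothetical sample size and number of scales

# Hypothetical normative scores: each scale generated independently.
normative = [
    [random.gauss(50, 10) for _ in range(N_SCALES)] for _ in range(N_PEOPLE)
]
ipsative = [ipsatize(person) for person in normative]

# Every person's ipsative scores sum to the same constant (zero here).
row_sums = [sum(person) for person in ipsative]

# Covariance matrix of the ipsative scales, and the sum of each of its rows.
columns = list(zip(*ipsative))
cov = [[covariance(a, b) for b in columns] for a in columns]
cov_row_sums = [sum(row) for row in cov]
```

Because every person's ipsative scores sum to the same constant, each
row of the covariance matrix sums to zero. The covariances of any one
scale with the others must therefore be negative on balance, whatever
the true relationships among the interests, which is one way of seeing
why such scores sit uneasily beside normative ability scores in a
correlation matrix.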

A factorial investigation of the Iowa Tests of Educational
Development (ITED), carried out by Cassell and Stancik (1960), is
relevant. They included eight ITED tests along with four tests from the
Differential Aptitude Battery (DAT), as well as the Cooperative Reading
Comprehension test and the California Test of Mental Maturity, in their
battery. These were administered to 124 high school students in Phoenix
in the United States. They obtained six factors including what is
called a "Natural Science" factor. This factor was determined by a
loading of only 0.41 from the ITED test "Interpretation of Natural
Science," and the test "Basic Natural Science" had a loading of -0.26
on the factor. It seems unsatisfactory to describe this factor as a
"Natural Science" factor. A re-rotation of the centroid factor loadings
from the analysis has been carried out during the present investigation
using the varimax analytical method. As a result of this rotation one
factor was obtained which had, as its major contribution, a loading of
0.67 from "Basic Natural Science." "Interpretation of Natural Science,"
however, had a very low loading on this factor. Its major contribution
was to what seemed to be a reading ability factor, upon which it had a
loading of 0.81. Interpretation of the various factors in the analysis,
whether as performed by the authors or after the varimax rotation, is
not straightforward. Presumably Cassell and Stancik included the DAT
tests to help identify the factors. Vernon (1961) pointed out that
these tests were designed with a view to the DAT battery as a whole
having useful predictive value for future achievement, and no claim for
factorial purity is made. This would help to explain the complex nature
of the analysis. The study points to the need for good reference tests
in such an investigation.

Vernon  (1961)  had  predicted  that  the  scientific  and  mathematical 
subjects should yield their own ability group factors. J. Wrigley (1958)
found evidence for a mathematical group factor, though general
intellectual ability seemed to be more important in attainment in
mathematics. The only evidence of a special scientific group factor is that
provided recently by Lewis (1961) whose work has been mentioned
already. Success in science seems to be predominantly dependent upon
general intellectual ability. The works of Roe (1956 and 1960) and
Brandwein (1960) are relevant in this regard. They have made extensive
surveys of intellectual and other personality characteristics of scientists.
The findings indicate that successful scientists usually have high
intelligence, but there has not emerged from these investigations an ability
which marks scientists off from other groups of people, over and above
general intellectual ability. Personality attributes such as independence,
curiosity, and persistence are regarded as highly significant.

Summary 

The  evidence  indicates  that  recall  of  knowledge,  especially  of 
the  meaningful  and  long  term  variety,  is  related  to  general  intellectual 
ability, rather than to some special capacity for memory. In factor
analysis meaningful memory has been found to relate to a general
intelligence factor or to a verbal factor when this has been obtained in the
analysis.

Discussions  amongst  scientists,  philosophers,  and  educators 
have  led  to  the  replacement  of  "skill  in  the  use  of  the  scientific  method" 
as an objective of science teaching. Its place has been taken by "the
development of problem-solving ability." The work of scientists is
not characterized by one given pattern of reasoning; it involves logical
reasoning, but creative activity is not explainable solely by this.
Science educators today conceive of the objective of skill in
problem-solving ability as involving both reasoning ability and creative activity.
The investigations of psychologists have led to the conviction that
problem-solving ability is related to personality variables such as
flexibility and confidence, in addition to intelligence as measured by
present tests. There is evidence that in the course of problem-solving
activity, definite phases can be identified, such as analysis of the
difficulty, production and verification, and it may be that skill in
problem-solving activity can be developed by training related to these
phases. The tests used to measure creativity are of a different nature
from those customarily used in measuring abilities. Evidence
concerning the relation of the abilities sampled by these tests to those
sampled by tests of aptitude and achievement is still fragmentary.

The STEP Science tests have been designed as measures of
problem-solving ability.

Science educators are currently placing a good deal of emphasis
upon the development of an understanding of science as an objective.
Increased understandings of the nature of scientific work, of the
dynamic quality of scientific knowledge, of the scientist, and of the
relationships between science and society are desirable in the present
age. Attempts to measure these understandings are beginning, but the
tests are still in the experimental stage. There is little evidence about
the relationship of the variables measured by such tests to commonly
used aptitude and achievement measures, and to personality variables
such as attitudes.

Science  educators  have  regarded  the  development  of  a  scientific 
attitude  as  a  desirable  outcome  of  the  experiences  of  students  in  science 
courses.  Investigators  have  concluded  that  such  an  attitude  includes 
elements  such  as  open-mindedness,  carefulness  and  accuracy,  belief 
in  cause  and  effect  relationships,  curiosity,  and  suspended  judgment. 
Evaluation  has  often  involved  the  use  of  one  instrument  to  measure  the 
scientific attitude, but the validity of such tests has been questioned,
and  a  more  profitable  procedure  probably  would  be  to  measure  the 
component  attitudes  separately. 

Factor  analytic  studies  in  science  have  indicated  that  success 
in  science  depends  on  general  intellectual  ability,  and  there  has  been 
little evidence so far of the existence of a unique scientific ability,
though this has been predicted.

CHAPTER III


THEORETICAL  FRAMEWORK 

The purpose of this chapter is to provide the theoretical framework
for the investigation. The main numerical technique involved,
namely  factor  analysis,  is  discussed;  certain  implications  are  drawn 
from  the  review  of  literature  in  Chapter  II  leading  to  the  statement  of 
some  postulates.  Terms  are  defined,  and  the  study  is  given  direction 
by  the  statement  of  hypotheses  to  be  tested. 

Factor  Analysis 

If  two  tests  are  administered  to  a  group  of  individuals,  and  the 
resulting  scores  indicate  that  there  is  a  high  correlation  between  the 
tests,  it  may  be  assumed  that  the  tests  are  measuring  something  in 
common,  that  is,  there  is  an  underlying  common  factor  or  hypothetical 
variable  common  to  both.  Positive  correlations  between  a  number  of 
tests  indicate  that  one  or  more  common  factors  are  operating  in  the 
domain.  Factor  analysis  provides  a  means  for  finding  out  information 
about  these  hypothetical  variables. 

The  variance  of  a  set  of  obtained  scores  for  a  given  test  is 
composed  of  true  variance,  arising  from  the  individual  differences  of 
the subjects, and error variance, resulting from errors of measurement.
The reliability of the test is a function of the true variance.
When  there  is  correlation  between  a  number  of  measures  some  of  the 
true  variance  will  be  common  to  two  or  more  tests,  and  some  of  it 
will be specific to each. In factor analysis it is the common variance
which is of special interest. If a number of hypothetical variables, the
common factors, fewer in number than the tests, can be found to express
the common variance and thus the common part of the test scores, then
parsimony has been achieved, and this is one of the desirable outcomes
of scientific enquiry. If, in addition, meaning can be given to these
common factors, then an ordering of thinking about the domain under
investigation has been achieved, and perhaps new light can be thrown
upon it, for example, new understanding of what the tests are measuring,
and of relationships between them.
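
The partition of variance described in this paragraph can be written out explicitly. What follows is only a sketch in conventional notation and is not taken from the source: the symbols $s_j^2$ (specific variance) and $e_j^2$ (error variance) and the numerical values are assumed here for illustration, while $h_j^2$ stands for the common variance of test j.

```latex
% Assumed notation: for a standardized test j,
%   h_j^2 = common variance, s_j^2 = specific variance, e_j^2 = error variance.
\begin{align*}
  1 &= \underbrace{h_j^2 + s_j^2}_{\text{true variance}}
       + \underbrace{e_j^2}_{\text{error variance}} \\
  \text{for example}\quad 1 &= 0.60 + 0.25 + 0.15
\end{align*}
```

On these assumed figures the true variance is 0.85, of which 0.60 is shared with other tests; only this common part is the concern of the factor analysis.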

Factor  analysis  is  based  upon  a  linear  mathematical  model. 

The score $z_{ji}$, for an individual i on a test j, is represented as a linear
combination of hypothetical variables:

$$z_{ji} = a_{j1}F_{1i} + a_{j2}F_{2i} + \cdots + a_{jm}F_{mi} + a_j U_{ji}$$

where

$z_{ji}$ is the score of person i on test j in standard form
(with mean 0 and standard deviation 1),

$a_{jp}$ (p = 1, 2, ..., m) is the factor loading, and expresses the
correlation between test j and factor p,

$F_{pi}$ (p = 1, 2, ..., m) is the factor score, and describes the
individual i in relation to the factor p,

$U_{ji}$ (i = 1, 2, ..., N) (j = 1, 2, ..., n) are the unique factors;
they describe the person i in relation to the unique part of
the test j,

$a_j$ (j = 1, 2, ..., n) is a factor loading expressing the correlation
between test j and the unique factor corresponding to test j.

For a given person there are n values of $z_{ji}$, one for each of n tests,
and there are N values of i, one for each of N individuals.
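
As a small illustration of the linear model, the sketch below computes one standard score from a set of loadings and factor scores. The function name, the numerical values, and the choice of the unique loading as the square root of one minus the sum of squared common loadings (which gives the test unit variance when all factors are standardized and uncorrelated) are assumptions made for this example, not details taken from the investigation.

```python
import math


def standard_score(common_loadings, factor_scores, unique_score):
    """Return z_ji = sum over p of a_jp * F_pi, plus a_j * U_ji.

    Assumption of this sketch: the unique loading a_j is set to
    sqrt(1 - sum of squared common loadings), so that the test has unit
    variance when all factors are standardized and uncorrelated.
    """
    communality = sum(a * a for a in common_loadings)
    if communality > 1.0:
        raise ValueError("squared common loadings must not sum to more than 1")
    unique_loading = math.sqrt(1.0 - communality)
    common_part = sum(a * f for a, f in zip(common_loadings, factor_scores))
    return common_part + unique_loading * unique_score


# Hypothetical test loading 0.6 and 0.5 on two common factors, taken by a
# person with factor scores 1.0 and -0.5 and a unique factor score of 0.2.
z = standard_score([0.6, 0.5], [1.0, -0.5], 0.2)
```

The common part of this score is 0.6(1.0) + 0.5(-0.5) = 0.35; the remainder comes from the unique factor.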

In order to gain information about the tests, the problem
consists in finding the values of the coefficients $a_{jp}$ for the common
factors. In the case of orthogonal common factors it can be shown
that the factor loadings are related to the correlation coefficients
for the tests by the following relation:

$$r_{jk} = a_{j1}a_{k1} + a_{j2}a_{k2} + \cdots + a_{jm}a_{km} + a_j a_k$$

where

j (j = 1, 2, ..., n) and k (k = 1, 2, ..., n) refer to tests, and
$r_{jk}$ is the correlation coefficient between tests j and k. The final
term arises from the unique factors; because the unique factors of
different tests are uncorrelated, it contributes only when j = k. When
n tests are considered, a symmetric matrix of order (n x n), of
$(n^2 - n)$ intercorrelation coefficients is obtained, not including the
diagonal values. If R is defined as containing estimates of each
test's common variance ($h_j^2$) in the diagonal, where $1 - u_j^2 = h_j^2$
($u_j^2$ being the unique variance of test j), then given Z = AF, which
is the linear mathematical model in matrix form, and orthogonal
factors it can be shown that

$$R + U^2 = AA^T$$

where

$U^2$ is a diagonal matrix of order (n x n) containing the $u_j^2$,

$A^T$ is the transpose of the matrix A,

$F^T$ is the transpose of the matrix F.
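
The relation between loadings and correlations can be checked numerically on a small case. The sketch below uses a hypothetical loading matrix for four tests on two orthogonal common factors; the values are invented for illustration. In this sketch the matrix A holds the common loadings only, so the unique variances are added to the diagonal separately rather than being carried as extra columns of A.

```python
# Hypothetical common-factor loadings: 4 tests (rows) on 2 orthogonal factors.
A = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.2, 0.6],
    [0.1, 0.7],
]


def reproduced_matrix(loadings):
    """Return the matrix whose (j, k) entry is the sum over p of a_jp * a_kp."""
    n = len(loadings)
    return [
        [sum(x * y for x, y in zip(loadings[j], loadings[k])) for k in range(n)]
        for j in range(n)
    ]


R = reproduced_matrix(A)                 # communalities appear on the diagonal
h2 = [R[j][j] for j in range(len(A))]    # common variance h_j^2 of each test
u2 = [1.0 - h for h in h2]               # unique variance u_j^2 = 1 - h_j^2

# Adding U^2 to the diagonal restores unit self-correlations.
full = [
    [R[j][k] + (u2[j] if j == k else 0.0) for k in range(len(A))]
    for j in range(len(A))
]
```

For these assumed loadings the correlation between the first two tests is 0.8(0.7) + 0.1(0.2) = 0.58, and the first test has a common variance of 0.65.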

There are various methods by means of which the matrix of factor
loadings can be determined starting from the matrix R or $(R + U^2)$,
for example the principal-factor technique outlined in Chapter VI
and the centroid method.

When  the  object  is  to  express  R  as  a  function  of  m  common 
factors  each  diagonal  value  must  be  equal  to  the  sum  of  the  squares 
of the factor loadings for the test concerned. This sum is called
the "communality," being a measure of the common variance involved
in that test. Thus:

communality of test j $= h_j^2 = a_{j1}^2 + a_{j2}^2 + \cdots + a_{jm}^2$.

In  order  to  find  the  factor  loadings  of  R  some  estimates  of  the 
communalities  are  initially  used  in  the  diagonal  cells. 
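
The use of initial communality estimates can be illustrated with a deliberately simplified, one-factor sketch. It is not the procedure used in this investigation: the starting estimate (the largest correlation in each row), the power-iteration routine, the function names, and the data are all assumptions made for the example. The correlations are generated from known loadings so that the result can be compared with them.

```python
import math


def first_principal_factor(matrix, steps=200):
    """Loadings on the first principal factor, found by power iteration."""
    n = len(matrix)
    v = [1.0] * n
    eigenvalue = 0.0
    for _ in range(steps):
        w = [sum(matrix[j][k] * v[k] for k in range(n)) for j in range(n)]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    # Loadings are the unit eigenvector scaled by the root of the eigenvalue.
    return [math.sqrt(eigenvalue) * x for x in v]


def iterate_communalities(corr, rounds=100):
    """Refine one-factor loadings by re-estimating the diagonal each round."""
    n = len(corr)
    # Starting estimate: largest absolute correlation in each row.
    h2 = [max(abs(corr[j][k]) for k in range(n) if k != j) for j in range(n)]
    loadings = []
    for _ in range(rounds):
        reduced = [
            [h2[j] if j == k else corr[j][k] for k in range(n)]
            for j in range(n)
        ]
        loadings = first_principal_factor(reduced)
        h2 = [a * a for a in loadings]
    return loadings, h2


# Hypothetical one-factor data: correlations generated from known loadings.
true_loadings = [0.9, 0.8, 0.7, 0.6]
corr = [
    [1.0 if j == k else true_loadings[j] * true_loadings[k] for k in range(4)]
    for j in range(4)
]
loadings, h2 = iterate_communalities(corr)
```

With these artificial data the refined loadings return to the values from which the correlations were built, and the communalities to their squares. Real data would not fit a single factor exactly.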

The solution, that is the matrix A, requires a set of m reference
axes. The common factors may be orthogonal or oblique. In the
former case the reference axes are perpendicular to one another. Each
test is represented by a point in this common factor space, or by a line
joining the origin to this point and called the test vector. The correlations
between the tests are determined by the angles between the test vectors,
and the loadings of the tests on the factors are given by the perpendicular
projections of the test vectors on the reference axes. So long as the
test  vectors  are  fixed,  the  relationships  between  the  tests  are  not  altered 
by  rotating  the  reference  axes  about  the  origin.  For  every  position  of 
these  axes  there  would  be  a  different  set  of  factor  loadings.  It  can  be 
seen  that  there  could  be  an  infinite  number  of  solutions  corresponding 
to  all  possible  placements  of  the  reference  axes.  The  particular  method 
of  analysis  places  the  axes  in  a  certain  position,  but  this  may  not  have 
special  significance  in  terms  of  psychological  meaning.  The  task  of  the 
investigator  is  then  to  rotate  the  reference  axes  in  search  of  a  solution 
which  is  meaningful  in  terms  of  his  knowledge  of  the  domain  under  study. 
It should be noted that each solution is as good as any other
mathematically; the validity of the one chosen must be argued on other grounds.
Rotation can be carried out by various graphical methods, and there
are  now  several  different  analytical  methods  employing  computer 
programs.  Some  of  the  latter  methods  give  solutions  which  have  been 
found  empirically  to  be  in  good  agreement  with  solutions  which  have 
been  obtained  by  the  graphical  methods  of  skilled  investigators. 
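
That rotation changes the loadings without changing the relationships between the tests can be verified directly. The following is a minimal sketch for the two-factor orthogonal case; the loading values, the angle of 30 degrees, and the function names are hypothetical.

```python
import math


def rotate(loadings, degrees):
    """Re-express two-factor loadings on axes turned through the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[a1 * c + a2 * s, -a1 * s + a2 * c] for a1, a2 in loadings]


def reproduced(loadings):
    """Correlations (and communalities) implied by a set of loadings."""
    n = len(loadings)
    return [
        [sum(x * y for x, y in zip(loadings[j], loadings[k])) for k in range(n)]
        for j in range(n)
    ]


# Hypothetical loadings of four tests on two orthogonal factors.
A = [[0.8, 0.1], [0.7, 0.2], [0.2, 0.6], [0.1, 0.7]]
B = rotate(A, 30.0)

before = reproduced(A)
after = reproduced(B)
largest_change = max(
    abs(before[j][k] - after[j][k]) for j in range(4) for k in range(4)
)
```

Here `largest_change` is zero apart from rounding error, although every individual loading in `B` differs from the corresponding loading in `A`. This is why the choice among rotated solutions has to be argued on grounds other than mathematical fit.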

Interpretation  and  the  Nature  of  Factors  in  Factor  Analysis 

In  the  domain  of  abilities  the  process  of  interpretation  has 
been  strongly  influenced  by  two  different  schools  of  thought. 

Spearman found that in some tests of abilities one general factor,
common to all, and specific factors were sufficient to explain nearly
all the variance. It was later found that in most practical cases some
group factors, common to a group of tests, were also necessary.
However, Spearman's concept of a general factor for intelligence, which
he called "g," has influenced one school of thought in trying to find a
solution which gives one general factor. The factor pattern is
determined by rotation so as to obtain such a general factor.

Thurstone showed that it was possible to provide a solution
solely in terms of group factors to which he could give psychological
meaning. This approach is called multiple-factor analysis. He put
forward a principle of "simple structure" as a guide to reducing the
complexity of variables. To achieve simplicity he sought to raise
the high loadings as close to unity as possible and reduce the low
loadings as close to zero as possible, over the factor pattern.
Thurstone set out criteria to be met for "simple structure" in an
attempt to achieve objectivity in the process of rotation. In regard
to the question of intelligence, Thurstone (1938) opposed the general
intelligence concept by postulating that intelligence was best described
in terms of group factors, which he called "Primary Mental Abilities."
Thurstone (1947) showed that second order factors could be obtained
from oblique multiple-factor solutions, and Rimoldi (1951) used this
technique to demonstrate that a second order factor from a
multiple-factor analysis of reasoning tests was very similar to Spearman's g.
Thus it seems that the two solutions are reconcilable. It might be
noted too, that the low intercorrelations which Thurstone originally
found between ability tests were partly due to the homogeneity of
his sample made up of University of Chicago students. The
controversy concerning general intelligence or primary abilities has
abated somewhat, but the two different approaches are still used.

The  English  factorists  tend  to  favour  a  general  plus  group  factor 
solution,  giving  a  hierarchical  structure,  whereas  in  the  United 
States the multiple-factor group method is more popular.

The  interpretation  of  a  group  factor  is  determined  by  the 
tests  which  have  loadings  on  that  factor.  If  some  of  these  tests 
involve  simple  content  as  opposed  to  complex,  then  the  task  of 
interpretation  is  easier.  For  example,  if  a  factor  has  one  of  its 
highest  loadings  on  a  test  of  numerical  computation,  then  the 
factor  is  likely  to  be  called  a  numerical  factor,  N.  The  other  tests 
with  loadings  on  this  factor  involve  the  same  ability  to  some  extent. 
Individual  differences  in  performance  on  these  other  tests  are  to 
some  extent  dependent  upon  this  ability.  Numerical,  verbal,  and 
spatial factors are some of those commonly found and accepted in
the realm of abilities. As has been noted in Chapter II, there is no
common  body  of  agreement  about  the  number  and  nature  of  reasoning 
factors.  It  is  clear  that  interpretation  is  facilitated  if  some  tests  with 
relatively  simple  content  are  included  in  a  factor  analysis;  such  tests 
are  called  "marker"  or  "reference"  variables.  The  factors  obtained 
in an analysis are hypothetical statistical entities. They do not
necessarily correspond to some special neurological organization, but a
person  can  be  said  to  behave  as  if  some  such  underlying  causal 
mechanism  were  present. 

Some  Considerations  Regarding  the  Design  of  Factorial  Investigations 

Certain  points  have  emerged  from  the  discussion  of  factor 
analysis  thus  far  which  have  implications  for  the  design  of  a  factorial 
investigation.  For  example,  it  is  important  to  include  in  a  battery  of 
tests  to  be  analyzed  marker  or  reference  tests  which  will  assist  in 
the  interpretation  of  the  results. 

Vernon  (1961)  has  suggested  that  it  would  be  much  easier  to 
compare  factorial  studies  in  the  realm  of  abilities  if  investigators 
could  agree  much  more  about  the  kinds  of  reference  variables  to  include. 
He recommends that "every factorial investigation should include in its
battery sufficient tests (preferably agreed standard ones) to give good
all-round measures of V, N, S, and I, very much as defined by
Thurstone, or alternatively of British g + v:ed + k:m." (1961, p. 171).
In many investigations this would seem to be useful. However,
investigations like those of Guilford into the finer structure of abilities are
also  important.  There  is  increasing  emphasis  upon  the  education  of 
the gifted, and in working with highly selected groups some of the
narrower group factors emerging from Guilford's work may well be
very  significant.  In  such  exploratory  studies  reference  variables  such 
as  those  suggested  by  Vernon  would  not  be  helpful. 

Both Vernon (1961) and Guilford (1954) recommend the inclusion
in a battery of three tests which are believed to be measures of a
group  factor  which  is  hypothesized.  This  suggestion  is  made  in  the 
interests  of  providing  a  fair  trial  for  hypotheses  which  are  set  up. 

The  theory  of  factor  analysis  involves  the  consideration  of 
the  group  of  test  vectors  in  a  space  having  one  dimension  for  each 
person.  Thus  the  minimum  size  of  a  sample  of  persons  would  be  a 
number  equal  to  the  number  of  tests  to  be  investigated.  However,  if 
some stability is to be achieved in the factor structure so that
conclusions can be generalized to a large population such as a school or
school  district,  then  larger  samples  are  needed.  Guilford  (1954) 
suggests,  as  a  guide,  that  the  sample  be  200  or  more. 

Some  Implications  for  the  Investigation  from  the  Literature  Review 

It  can  be  concluded  from  past  research  that  the  kind  of  memory 
which  is  important  in  achievement  in  science  is  related  to  general 
intellectual  ability  rather  than  to  a  special  memory  factor.  Further, 
there  is  evidence  that  in  factor  analytic  studies  where  a  verbal  factor 
is  obtained  tests  of  meaningful  memory  load  on  that  factor.  It  seems 
reasonable  to  believe  that  in  an  analysis  yielding  reasoning  and  verbal 
factors  a  test  of  knowledge  involving  simple  recall  type  items  would  have 
high  loadings  on  the  verbal  factor,  and  probably  small  or  zero  loadings 
on the reasoning factor. If this is true the contention that school
examinations measure mainly factual knowledge could be tested by
finding out whether or not these examinations correlate highly with a test
of  simple  recall,  and  load  highly  on  the  verbal  factor,  but  have  small 
loadings  or  zero  loadings  on  a  reasoning  factor. 

Problem-solving  abilities,  as  an  objective  of  science  teaching, 
involve creative activity as well as logical reasoning. These two
functions of the human mind are related to one another, but it is helpful to
consider them separately, as well as in conjunction. Certain kinds of
tests measure logical reasoning satisfactorily, but an increasing interest
in creativity is leading to the design of other kinds of tests to measure
variables  not  sampled  by  such  reasoning  tests.  Wilson  and  Guilford 
(1954)  claimed  that  to  measure  creative  abilities,  something  needs  to 
be  produced  by  the  examinee,  and  they  have  used  mainly  completion 
tests  in  their  investigations.  Some  other  workers,  for  example  Getzels 
and  Jackson  (1962),  have  been  following  this  lead.  An  investigation  of 
tests  and  test  items  in  science  which  the  literature  suggests  as  measures 
of  problem-solving  ability  or  critical  thinking  shows  that  they  are  of  the 
multiple  choice  type  or  some  variation  of  this.  It  is  likely  that  the  most 
which  could  be  claimed  for  such  tests  is  that  they  measure  the  reasoning 
aspects  of  problem-solving.  If,  in  fact,  they  do  this  well,  they  would  be 
very  useful  instruments,  but  evidence  to  support  this  does  not  seem  to  be 
stronger  than  evidence  that  school  examinations,  involving  a  variety  of 
item  types,  typically  measure  such  reasoning  aspects.  Wilson  and 
Guilford,  in  the  reference  just  mentioned,  noted  that  many  of  their  tests 
of creativity had low reliabilities, some being in the region of 0.4. Vernon
(1961) has claimed that it is almost impossible to measure creativity
reliably. In view of these considerations it was decided to limit the
present  study,  so  far  as  problem-solving  tests  were  concerned,  to  an 
investigation  of  the  extent  to  which  they  measure  reasoning  ability. 

(The work of Getzels and Jackson (1962), published after this investigation
was begun, indicated that the tests used by them have quite high
reliabilities, values of the order 0.8 to 0.9 being reported.)

Accordingly,  a  criterion  for  the  validity  of  the  problem-solving 
tests  in  this  study  was  set  up,  namely  that  they  should  show  high  loadings 
on  a  reasoning  factor  in  an  analysis  of  variables  including  these  tests 
and  certain  reasoning  marker  tests.  Whether  or  not  narrowly  defined 
reasoning  abilities,  such  as  an  ability  to  draw  conclusions  or  an  ability 
to define problems, are significant in determining individual differences
in  performance  of  grade  X  students  is  a  question  which  is  open  to 
investigation.  An  attempt  was  made  to  ascertain  whether  or  not  the  STEP 
Science  test  measures  these  abilities,  as  its  publishers  claim  that  it  does. 

Understanding  of  science,  as  an  objective,  seems  to  involve 
knowledge,  and  the  ability  to  reason  with  this  knowledge  in  regard  to 
science  as  a  human  enterprise  and  about  scientists.  In  view  of  the  fact 
that  tests  of  knowledge  have  been  used  as  measures  of  attitudes  in  some 
fields  where  there  is  likely  to  be  a  strong  affective  component,  it  is  also 
probable that tests of understanding science would be related to some
attitudes, such as attitudes to science and scientists. The claim for
listing understanding science as a separate objective would seem to be that
the subject-matter involved is sufficiently important to cut across
specific disciplines, be they physics, chemistry, biology, or even
groupings such as general science.

As reviewed in Chapter II, the claims of investigators about
the  components  of  a  scientific  attitude  are  not  in  complete  agreement. 

A  list  made  up  from  their  reports  would  include  curiosity,  belief  in 
cause and effect relationships, suspended judgment, the habit of
weighing evidence, open-mindedness, accuracy and carefulness, impartiality,
freedom  from  prejudice  and  superstition,  intellectual  honesty,  and 
critical  mindedness.  It  seems  that  there  is  some  overlap  between 
some  of  these  elements,  and  that  the  list  could  be  shortened.  The 
following  six  components  are  suggested  as  representing  those  just 
listed  without  any  significant  loss:  curiosity,  belief  in  cause  and  effect 
relationships,  open-mindedness,  accuracy  and  carefulness,  intellectual 
honesty,  and  critical  mindedness.  Because  of  the  limitations  of  tests 
available,  the  time  involved  in  constructing  such  tests,  and  testing 
time,  it  was  decided  by  way  of  exploration  in  this  area  to  use  tests  of 
three  of  these  components,  open-mindedness,  curiosity,  and  belief  in 
cause  and  effect  relationships.  A  test  of  superstitious  beliefs  was 
adopted  as  a  measure  of  belief  in  cause  and  effect  relationships.  No 
tests  of  curiosity  were  available.  It  was  believed  that  the  attitude  of 
subjects  to  investigation  and  discovery  of  knowledge  was  likely  to  be 
related  to  curiosity,  and  a  test  to  measure  this  attitude  was  constructed. 
The  approach  taken  here  to  measuring  scientific  attitude  is  in  line  with 
the  contention  made  in  Chapter  II  that  it  is  more  defensible  to  attempt 
to  measure  the  components  of  scientific  attitude  separately  than  by  one 
test. Whether one speaks of a scientific attitude, or of scientific
attitudes, seems to be somewhat arbitrary. An exploratory aspect of the
present investigation was to see whether or not the three tests used had
sufficient common ground to indicate that one scientific attitude is a
possibility. The existence of such an attitude was set up as an
hypothesis to be tested. In line with this thinking, a criterion for the
validity  of  the  attitude  measures  was  framed  as  follows:  The  attitude 
tests  should  show  high  loadings  on  separate  attitude  factors,  or  on  a 
common  attitude  factor  if  such  exists,  and  low  or  zero  loadings  on 
ability  factors.  It  does  not  follow  that  meeting  this  criterion  establishes 
their validity beyond doubt, but it does provide some evidence. It may
be  regarded  as  necessary,  but  not  sufficient  on  its  own. 

Some  further  comments  about  scientific  attitude  need  to  be 
made  at  this  point.  The  concept  is  neither  that  of  an  attitude  or  attitudes 
which  are  only  found  amongst  scientists,  nor  of  attitudes  only  found 
among  individuals  working  in  the  laboratory.  The  attitudes  in  question 
have  been  evident  in  scientists  engaged  in  their  work,  and  it  has  been 
believed  that  the  existence  of  such  attitudes  has  been  important  to  the 
success  of  scientists.  The  nature  of  his  discipline,  in  general,  enforces 
such attitudes upon the scientist. Successful scientists have not always
consistently  displayed  such  attitudes  in  life  outside  their  laboratories. 
However,  if  the  attitudes  have  contributed  significantly  to  the  solution 
of  scientific  problems  it  can  be  argued  that  they  would  be  a  help  in 
many  aspects  of  daily  living.  The  learning  of  such  attitudes  is  not 
confined to experience in science courses, but these courses can
provide a very suitable context, and it is desirable that science teaching
should contribute to the development of these attitudes.

From  the  review  of  the  literature  one  can  conclude  that  success 
in science depends upon general intellectual ability. If there is some
further ability involved, it is likely that this would not be specific to
different sciences such as a physics or a biology ability, at least in a
sample of high school students having a considerable range of general
intellectual ability. If this is so the different sciences would not
determine the factor structure in an analysis involving several of them.
This is the position which has been taken here, and an analysis was
carried  out  to  throw  light  upon  this.  The  allocation  of  items  of  the 
STEP  Science  test  to  different  sciences  given  in  the  teacher's  guide 
was  used  as  a  basis  for  forming  subtests  for  biology,  chemistry, 
physics,  and  meteorology  and  these,  along  with  school  science  marks 
for  grade  IX  and  grade  X,  were  factor  analyzed.  If  one  ability  is  the 
major contributor a general factor should account for a large
percentage of the common variance.

The  considerations  outlined  in  this  section  of  the  chapter 
indicate  some  basic  positions  from  which  the  investigation  proceeded. 
They  may  be  summarized  in  the  postulates  which  follow. 

Postulates 

1.  A  test  of  factual  knowledge  requiring  simple  recall  shows  high 
loading on a verbal factor but small or zero loadings on a reasoning
factor in a factor analysis yielding these factors.

2.  A  broad  reasoning  factor  can  be  obtained  in  a  factor  analysis  of 
tests  of  abilities  by  the  inclusion  of  marker  tests  for  induction, 
deduction,  and  general  reasoning. 

3.  Problem-solving  ability  as  an  objective  of  science  teaching  involves 
creative  activity  as  well  as  logical  reasoning. 

4. Problem-solving tests of the multiple choice variety involve
reasoning ability, and a given test should satisfy the criterion that
it  shows  high  loading  on  a  broad  reasoning  factor  in  a  factor  analysis 
where  such  a  factor  is  obtained. 

5.  A  necessary  condition  for  the  validity  of  tests  of  scientific  attitudes 
is  that  they  have  high  loadings  on  separate  attitude  factors  or  on  a 
general  scientific  attitude  factor  if  this  can  be  obtained. 

6.  At  the  high  school  level,  performance  on  tests  which  include  items 
from  different  sciences  such  as  biology,  chemistry,  and  physics, 
is  determined  by  general  intellectual  ability  rather  than  abilities 
specific  to  the  different  sciences.  When  a  factor  analytic  study 
involves  such  tests,  the  factor  structure  is  not  determined  by 
abilities  specific  to  the  different  sciences. 

Definitions 

These  definitions  are  to  be  regarded  as  working  definitions. 
They  are  taken  from  the  literature  or  are  based  on  the  literature  with 
a  view  to  relating  tests  to  the  objectives  involved. 

1.  "Science."  The  definition  suggested  by  Conant  (1951)  is  adopted,
namely,  "science  is  an  interconnected  series  of  concepts  and  conceptual
schemes  that  have  developed  as  a  result  of  experimentation  and  observation
and  are  fruitful  of  further  experimentation  and  observations."
(1951,  p.  25).  It  is  both  a  body  of  knowledge  and  a  human  activity.

2.  "Knowledge"  as  an  objective  is  defined  according  to  Bloom's 
Taxonomy  of  Educational  Objectives  (1956).  It  involves  "the  recall 
of  specifics  and  universals,  the  recall  of  methods  and  processes,  or 
the  recall  of  a  pattern,  structure,  or  setting.  For  measurement 
purposes,  the  recall  situation  involves  little  more  than  bringing  to 
mind  the  appropriate  material."  (1956,  p.  201).

3.  "Problem."  For  discussion  of  problem-solving,  the  following
definition  of  a  problem  is  used,  and  is  based  on  a  definition  given  by 
Thorndike  (1950).  A  problem  is  a  situation  in  which  the  individual  is 
motivated  to  reach  a  particular  objective,  but  progress  is  blocked 
because  available,  habitual  response  patterns  and  knowledge  are  not 
adequate,  without  modification,  to  enable  him  to  proceed  to  the 
objective. 

4.  "Reasoning."  Reasoning  ability  is  operationally  defined  as  the
ability  or  abilities  measured  by  the  group  factor  determined  by  the 
tests  of  induction,  deduction,  and  general  reasoning  included  in  the 
battery. 

5.  "Understanding  Science"  as  an  objective  of  science  teaching  is 
defined  as  knowledge  about  science  (for  example,  its  dynamic  quality), 
scientists  (for  example,  the  attitude  of  the  scientist  to  his  work),  and 
relationships  between  science  and  technology,  and  science  and  society. 

6.  "Attitude"  is  defined  in  the  terms  of  Vernon  (1957)  as  "a  personality
disposition  or  drive  which  determines  behaviour  towards,  or
opinions  and  beliefs  about,  a  certain  type  of  person,  object,  situation,
institution,  or  concept."  (1957,  p.  144).

7.  "Scientific  Attitude"  is  defined  as  the  one  or  more  attitudes 
involved  in  curiosity,  open-mindedness,  belief  in  cause  and  effect 
relationships,  accuracy  and  carefulness,  intellectual  honesty,  and 
critical  mindedness,  as  these  are  exhibited  by  scientists  at  work. 

Hypotheses 


The  purposes  of  this  study  are  to  seek  evidence  about  the
validity  of  certain  tests  and  to  explore  the  processes  involved  in  these
and  other  tests  and  the  relationships  between  tests  of  the  four  objectives.
The  following  hypotheses  were  set  up  to  direct  the  investigation:

Hypotheses  related  to  the  validity  of  tests:

1.  Certain  tests  of  problem-solving  ability  in  science  measure  ability 
to  reason  with  the  knowledge  which  they  require  to  be  recalled.  To 
test  this  hypothesis  the  following  prediction  was  made:

The  STEP  Science  Test  and  the  Test  of  Application  of  Scientific
Knowledge  will  show  high  loadings  on  a  reasoning  group  factor.

2.  Certain  tests  of  scientific  attitude  measure  different  attitudes  or 
one  attitude,  but  not  abilities.  It  was  predicted  that  the  tests  of
scientific  attitude  would  have  high  loadings  on  one  or  more  attitude 
factors  and  low  or  negligible  loadings  on  other  factors. 

3.  Examinations  in  science  used  by  the  Alberta  Department  of  Education
and  Edmonton  schools  at  the  grade  IX  and  grade  X  levels,  respectively,
measure  knowledge  primarily.  They  measure  problem-solving
ability  to  a  small  extent  and  do  not  measure  scientific  attitudes.  The
tests  of  this  hypothesis  were  provided  by  the  following  deductions:

a.  Section  A  of  the  Alberta  Department  of  Education, 
Departmental  Examination,  Grade  IX,  Science  (1961),  will  show
high  loading  on  a  verbal  group  factor,  and  a  negligible  loading  on  a 
reasoning  group  factor. 

b.  Section  B  of  the  same  examination  will  correlate  highly 
with  Section  A  and  will  show  high  loading  on  a  verbal  group  factor, 
and  a  low  loading  on  a  reasoning  factor. 

c.  Sections  A  and  B  of  the  same  examination  will  show 
negligible  loadings  on  attitude  factors,  as  will  Science  X  examinations.

d.  The  instruments  used  to  measure  performance  in  Science  X 
in  Edmonton  high  schools  will  show  high  loading  on  a  verbal  factor  and 
a  low  loading  on  a  reasoning  factor. 

Hypotheses  related  to  exploratory  aspects  of  the  study: 

4.  At  the  grade  X  level,  there  is  no  unique  problem-solving  ability 
measured  by  the  tests  in  this  battery  and  selected  as  measures  of  the 
problem-solving  objective.  The  tests  involve  mainly  verbal  and 
reasoning  factors.  The  test  of  this  hypothesis  is  made  in  terms  of 
the  following  prediction: 

A  group  factor  having  high  loadings  on  Parts  I  and  II  of  the
STEP  Science  Test,  and  on  the  Test  of  Application  of  Scientific  Knowledge,
but  low  loadings  on  other  tests,  will  not  be  obtained.

5.  Whilst  the  STEP  Science  test  measures  a  broad  reasoning  ability,
it  does  not  measure  reliably  "ability  to  identify  and  define  scientific
problems,"  "ability  to  suggest  and  screen  hypotheses,"  "ability  to
select  valid  procedures,"  "ability  to  interpret  data  and  draw  conclusions,"
"ability  to  evaluate  critically  claims  or  statements  made  by
others,"  and  "ability  to  reason  quantitatively  and  symbolically."  To
test  this  hypothesis  an  analysis  of  subtests  of  STEP  Science  was
carried  out,  based  on  the  allocation  given  in  the  Teacher's  Guide
(1959)  of  items  to  the  categories  mentioned  in  the  hypothesis.  It
was  predicted  that  this  analysis  would  yield  a  general  factor  which
would  account  for  most  of  the  common  variance,  and  would  not
yield  group  factors  which  could  be  interpreted  as  the  abilities  mentioned
in  the  hypothesis.

6.  Success  in  science  is  partly  dependent  upon  a  special  scientific 
ability.  The  following  deduction  was  made  to  test  the  hypothesis:

The  tests  called  STEP  Science,  A  Test  of  Application  of 
Scientific  Knowledge,  Science  X  and  Grade  IX  Science,  and  A  Test  of 
Understanding  Science,  will  show  intermediate  loadings  upon  a  factor 
upon  which  the  other  variables  will  have  low  or  zero  loadings. 

7.  Tests  of  understanding  science  measure  knowledge,  ability  to 
reason  with  that  knowledge,  and  attitudes.  It  was  predicted  that  the 
Test  of  Understanding  Science  would  show  high  loadings  on  a  verbal 
factor,  and  intermediate  loadings  on  reasoning  and  attitude  factors. 

8.  Tests  of  scientific  attitude  components  overlap  sufficiently  to 
produce  a  common  hypothetical  variable  which  can  be  called  "the
scientific  attitude."  This  was  tested  by  the  prediction  that  the  three
tests  of  attitudes  would  have  moderate  loadings  on  a  group  factor 
upon  which  other  tests  would  have  negligible  loadings. 


CHAPTER  IV 


DESIGN  OF  THE  INVESTIGATION 
AND  PROCEDURES  USED 

Population  and  Sample 

The  population  used  consisted  of  the  grade  X  students  attending 
classes  in  Science  X  courses  in  the  public  high  schools  of  the  City  of 
Edmonton,  Alberta.  Some  grade  X  students  follow  an  alternative 
Science  XII  course  which  is  less  difficult  and  more  general.  In  the 
interests  of  being  able  to  interpret  the  results  in  relation  to  the  examinations
used  in  the  local  schools,  it  was  decided  to  limit  the  investigation
to  the  Science  X  course.  Also  because  of  this,  it  was  decided  to
use  a  sample  which  was  representative  of  the  population.  It  is  unlikely
that  the  results  would  have  been  significantly  different  if  the  population
had  included  students  in  the  separate  school  system  in  Edmonton,  and  if
it  had  included  a  sample  from  rural  schools.  The  mean  scores  for  these
groups  on  the  tests  might  have  been  different,  but  it  is  unlikely  that  the
factor  pattern  would  have  been  very  different.

The  Edmonton  School  Board  advised  that  the  Edmonton  high
schools  fit  into  three  groups  in  terms  of  the  socioeconomic  status  of  the
parents.*  One  high  school  was  chosen  from  each  group.  After  securing
the  cooperation  of  the  principals  and  staff,  the  heads  of  science  departments
were  asked  to  nominate  three  classes  so  as  to  give  a  roughly
representative  group  for  each  school.  It  was  expected  that  this  would
result  in  a  total  sample  of  approximately  300  students  participating  in
the  testing  program.  This  number  was  based  on  the  claim  of  Guilford
(1954)  that  the  number  should  be  at  least  200  in  factorial  studies.

*Personal  communication  from  Mr.  M.  J.  V.  Downey,  Personnel
Officer,  Educational  and  Office  Personnel,  Edmonton  Public  School
Board.

It  was  found  that  there  were  some  grade  XI  students  in  the 
Science  X  classes.  The  papers  of  these  students  were  eliminated 
because  the  Science  IX  examination  was  one  of  those  included  in  the 
investigation,  and  the  grade  XI  students  had  completed  a  different 
version  of  this  from  that  completed  by  the  grade  X  students.  After 
eliminating  these  subjects  and  those  who  were  not  present  for  all  the 
seven  testing  periods  in  this  project,  along  with  those  who  had  recently 
come  to  the  province,  or  for  whom  the  Science  IX  marks  were  not  available
for  some  other  reason,  the  total  in  the  sample  was  185.  This
sample  was  tested  for  representativeness  of  the  population  in  terms  of 
the  following  variables: 

1.  School  and  College  Ability  Tests  (SCAT)  Verbal  Scores.  (This  will 
be  referred  to  as  SCAT  Verbal) 

2.  School  and  College  Ability  Tests  (SCAT)  Quantitative  Scores.  (This 
will  be  referred  to  as  SCAT  Quant.)

3.  Science  9A.  (This  is  the  machine-scored  objective  test  section  of
the  Departmental  Examination  for  Grade  IX  Science.  It  is  used
separately  from  Section  B,  in  view  of  the  fact  that  Section  A  was
adopted  as  a  measure  primarily  of  recall.  For  convenience  in
reading  the  tables,  this  will  be  referred  to  as  Science  9A.)

4.  Science  9B.  (This  was  the  score  on  the  essay  and  problem  section
of  the  Departmental  Examination  in  Grade  IX  Science;  it  will  be
referred  to  as  Science  9B  in  the  tables.)

5.  Age.

6.  Sex. 

The  SCAT  scores  and  the  Grade  IX  Science  marks  were  used  to
equate  the  groups  in  terms  of  general  intellectual  ability  and  previous
science  achievement,  respectively.  Age  was  thought  to  be  important
because  it  is  likely  that  responses  to  the  attitude  scales  by  a  given  student
vary  with  age.  The  work  of  Meyer  and  Penfold  (1961)  and  others
has  indicated  that  there  is  a  sex  difference  in  science  interest,  and  in
view  of  this  sex  was  included  as  a  variable  upon  which  the  groups  were
equated.  To  carry  out  this  test  for  representativeness,  a  random  sample
was  taken  from  the  total  population,  that  is,  from  all  Science  X  classes
in  the  six  Edmonton  high  schools.  This  was  done  by  selecting  eight
students  from  each  class  with  the  aid  of  a  table  of  random  numbers.
Individuals  in  grade  XI  and  those  for  whom  the  necessary  information  was
not  available  were  eliminated  and  then  six  were  chosen  from  the  remainder,
except  in  cases  where  fewer  than  six  remained  after  the  elimination.  This
process  resulted  in  a  reference  group  of  375  students  representing  sixty-eight
classes.  When  the  sample  under  investigation  was  compared  with
the  random  sample  it  was  found  that  the  former  contained  too  many  students
in  the  high  ability  range.  Further  selection  from  the  185  students
produced  a  sample  of  166  subjects.  The  information  showing  the  representative
nature  of  this  group  in  terms  of  the  selected  variables  is  summarized
in  Table  I.
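
The procedure used to draw the random reference group can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the investigator's actual procedure: a pseudo-random generator stands in for the table of random numbers, the final six are assumed to be chosen at random from those remaining (the text does not say how they were chosen), and the names `draw_reference_group`, `classes`, and `eligible` are hypothetical.

```python
import random


def draw_reference_group(classes, eligible, first_draw=8, keep=6, seed=0):
    """Draw a reference group class by class.

    classes  : dict mapping a class label to a list of student records
    eligible : function returning True for a student who is in grade X
               and has all the required information available
    """
    rng = random.Random(seed)  # stands in for a table of random numbers
    group = []
    for students in classes.values():
        # Step 1: select eight students from the class at random.
        drawn = rng.sample(students, min(first_draw, len(students)))
        # Step 2: eliminate grade XI students and incomplete records.
        remaining = [s for s in drawn if eligible(s)]
        # Step 3: keep six, or all of them when fewer than six remain.
        if len(remaining) > keep:
            remaining = rng.sample(remaining, keep)
        group.extend(remaining)
    return group


if __name__ == "__main__":
    # Hypothetical records: (student id, grade, marks available).
    classes = {
        "class_1": [(i, "X", True) for i in range(30)],
        "class_2": [(i, "XI" if i % 3 == 0 else "X", True) for i in range(25)],
    }
    group = draw_reference_group(
        classes, eligible=lambda s: s[1] == "X" and s[2]
    )
    print(len(group), "students in the reference group")
```

With sixty-eight classes and at most six students kept per class, the upper limit is 408 students, which is consistent with the reported reference group of 375.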

TABLE  I

COMPARISON  OF  THE  REPRESENTATIVE  SAMPLE  WITH  A  RANDOM  SAMPLE
OF  SCIENCE  X  STUDENTS  IN  TERMS  OF  SIX  SELECTED  VARIABLES

Selected  Tests

As  was  pointed  out  in  Chapter  III,  it  has  been  recommended  that,
in  order  to  facilitate  the  interpretation  of  a  factor  analysis,  three  tests
of  a  hypothesized  factor  should  be  included.  Another  recommendation
was  that  tests  of  verbal,  number,  spatial,  and  inductive  reasoning
abilities  should  be  included.  Testing  time  available  in  the  schools  placed
a  limitation  on  the  number  of  tests  which  could  be  administered.  Because
of  this,  only  one  verbal  test  was  administered  and  two  number  and  two
spatial  tests.  It  was  believed  that  verbal  and  number  ability  would  be
involved  in  other  of  the  tests  and  that  the  above  number  of  tests  would  be
adequate,  especially  in  view  of  the  fact  that  verbal  and  quantitative  scores
on  the  scholastic  aptitude  test  were  available  from  testing  at  the  end  of
grade  IX.  In  view  of  the  evidence  concerning  memory  factors,  which  was
summarized  in  Chapter  II,  it  was  decided  not  to  include  any  tests  of
memory.  It  seemed  likely  that  the  kind  of  memory  involved  in  the  science
test  performances  would  be  more  related  to  the  verbal  factor  than  to  rote
memory  factors.  Limitations  on  testing  time  prevented  the  inclusion  of
any  measures  of  science  interest,  although  it  would  have  been  useful  to
examine  the  relation  of  attitudes  and  understanding  of  science  to  interest.

A  Test  of  Science  Knowledge 

A  conclusion  which  one  could  draw  from  the  criticisms  which  have 
been  cited,  that  many  science  examinations  involve  evaluation  of  factual 
knowledge  predominantly,  is  that  such  tests  require  simple  recall  on  the 
part  of  the  subject,  whereas  problem-solving  tests  require  reasoning 
processes  in  addition  to  recall.  It  was  desirable  then,  to  include  a  test 
which  seemed  to  involve  recall  of  facts  and  principles  with  little  reasoning 
so  that  along  with  the  reasoning  tests  it  would  provide  a  basis  of  comparison
for  the  problem-solving  tests.

The  machine  scored  objective  test  section  of  the  Grade  IX
Science  Examination  set  by  the  High  School  Entrance  Examination 
Board  (1961)  appeared  to  be  largely  of  this  type.  The  raw  scores  on 
this  test  were  made  available  by  the  Department  of  Education  and  were 
included,  with  the  above  purpose  in  mind.*  The  test  was  made  up  of
forty  items  of  the  multiple  choice  variety,  each  item  having  five
suggested  answers.  These  were  expressed  quite  briefly,  for  example
(p.  2):

"4.  The  percentage  of  oxygen  in  the  air  is  about
    1.  21%    2.  78%    3.  1%    4.  99%    5.  2.4%"

Of  the  forty-eight  items,  it  was  judged  that  only  eleven  required  some
simple  application  of  principle;  the  remainder  involved  recall  only.

*The  writer  is  indebted  to  Mr.  V.  R.  Nyberg,  Co-ordinator  of
Testing  and  Research,  Department  of  Education,  Alberta,  for  making
this  information  available.

Tests  of  Problem-solving  Ability

Sequential  Tests  of  Educational  Progress  (STEP),  Science,  Form  2A.
This  test  is  published  by  the  Educational  Testing  Service  (1957).  The
manual  (1957)  claimed  that  the  test  measures  six  types  of  scientific
reasoning.  These  have  been  mentioned  in  Chapter  II.  They  will  be
discussed  again  in  Chapter  VI,  and  will  not  be  listed  here.  The  allocation
of  items  to  each  category  is  given  in  the  Teacher's  Guide  (1959).
The  items  of  the  test  are  of  the  multiple  choice  kind  with  four  suggested
answers  in  each  case.  These  items  are  arranged  in  ten  groups,  each
group  being  related  to  a  certain  situation,  for  example,  Bill  and  Jim
building  a  car,  or  the  construction  of  a  superhighway  near  your  house.
The  test  is  divided  into  two  parts  for  administrative  purposes  and  the
scores  on  these  two  parts  were  recorded  separately,  giving  two  subtests
for  STEP  Science.

The  STEP  tests  have  been  praised  by  Anastasi  (1961)  for  good
construction  and  as  being  well  related  to  objectives.  They  have  been
cited  by  Mills  and  Dean  (1960)  as  an  example  of  objective  tests  of
problem-solving,  and  the  authors  of  the  National  Science  Teachers'
Association  Report  to  members  attending  its  Tenth  Annual  Convention
(1962)  pointed  out  that  the  STEP  Science  tests  appear  to  support  the
point  of  view  that  problem-solving  skills  can  be  measured  effectively.
The  reviewer  in  the  Mental  Measurements  Yearbook  (1959)  said  that
"the  test  situation  is  a  problem  situation."  (1959,  p.  714).  The  main
criticism  offered  by  this  reviewer  was  that  more  factual  content
would  be  helpful.  The  "generalness"  of  the  tests,  however,  is  necessary
for  their  wide  applicability.  Mention  has  previously  been  made  of
the  fact  that  a  recent  investigation  by  Gega  and  Karlsen  suggests  that
the  actual  format,  that  is  the  situational  casting  of  the  items  as  opposed
to  non-situational  casting,  does  not  seem  to  make  any  difference  to  the
scores  obtained  by  subjects.

The  Technical  Report  (1957)  gave  a  reliability,  measured  by  the 
Kuder-Richardson  Formula  20,  of  0.81.  The  publishers  expected  to 
conduct  validity  studies  but  the  only  information  they  have  provided  as 
yet,  apart  from  face  validity,  is  that,  according  to  the  1958  SCAT-STEP 
Supplement,  a  group  of  271  seventh  grade  students  showed  a  correlation 
of  0.66  between  STEP  Science  and  school  science  grades.  No  information
is  given  about  the  basis  upon  which  the  school  grades  were  determined.
Therefore  it  is  not  possible  to  draw  conclusions  from  it  concerning  the
validity  of  STEP  Science  as  a  measure  of  problem-solving  ability.  An
analysis  of  the  test,  form  2A,  in  terms  of  the  subject  matter  covered,
indicated  that  it  would  be  satisfactory  in  this  regard.  There  was  also
evidence  that  it  had  proved  satisfactory  in  an  investigation  in  the  Clover
Bar  subdivision  in  Alberta  conducted  by  D.  B.  Black.*

*Personal  communication  from  Dr.  D.  B.  Black,  Division  of
Educational  Psychology,  Faculty  of  Education,  University  of  Alberta.

Tests  of  Understanding  Science

Test  of  Understanding  Science  (TOUS),  Form  W.**  This  test  was  developed
at  the  Harvard  University  Graduate  School  of  Education  by  W.  W.  Cooley
and  L.  E.  Klopfer,  and  was  published  by  the  Educational  Testing  Service
(1961).  In  the  words  of  the  Manual  (1961)  it  is  "a  research  instrument  to
measure  high  school  students'  understandings  of  science  and  scientists."
(1961,  p.  1).  It  is  claimed  that  three  areas  are  covered.  Area  I  (18
items),  Scientific  Enterprise,  has  as  its  themes  the  human  element  in
science,  communication  among  scientists,  scientific  societies,  instruments,
money,  the  international  character  of  science,  and  the  interaction  of
science  and  society.  Area  II  (18  items),  The  Scientist,  has  as  its  themes
generalizations  about  scientists  as  people,  institutional  pressures  on
scientists,  and  abilities  needed  by  scientists.  Area  III  (24  items),  Methods
and  Aims  of  Science,  has  as  its  themes  generalities  about  scientific  methods,
tactics  and  strategy  of  sciencing,  theories  and  models,  aims  of  science,
accumulation  and  falsification,  controversies  in  science,  science  and
technology,  and  unity  and  interdependence  of  the  sciences.

**The  investigator  acknowledges  his  appreciation  to  the  Educational
Testing  Service  and  the  authors  of  this  test  for  permission  to  use
it  in  the  present  study.

The  sixty  items  are  of  the  multiple  choice,  four  response, 
type.  Seven  items  in  Area  II  are  somewhat  different  from  the  others 
in  format;  these  give  a  statement  and  an  accompanying  reason  and  the 
subject  must  select  one  response  which  covers  true  or  false  on  both 
statement  and  reason;  that  is,  there  are  four  possible  choices  of  which 
one  is  correct.  Four  keys  are  provided  yielding  scores  for  the  three 
areas  mentioned  and  a  total  score. 

Because  of  the  recency  of  the  development  of  this  test  there
are  no  evaluations  of  it  available  in  the  literature. 

The  manual  (1961)  gave  the  following  reliabilities  determined
by  application  of  the  Kuder-Richardson  20  formula:  Area  I:  0.58,
Area  II:  0.52,  Area  III:  0.58,  total:  0.76.  The  standard  error  of
measurement  for  the  total  score  is  3.49.  The  evidence  for  the  validity
of  the  instrument  is  mainly  in  terms  of  face  validity  at  present,  inclusion
of  themes  and  content  being  decided  after  discussion  with  consultants.
Some  further  evidence  comes  from  an  analysis  of  responses  of
seventy-eight  talented  high  school  students  in  two  summer  programs
in  which  the  students  were  in  active  contact  with  working  scientists.  An
earlier  form  of  the  test  was  administered  to  these  students  before  and
after  the  summer  program,  and  significant  changes  were  observed  towards
the  desired  "correct"  responses  on  the  second  testing.  No
significant  change  was  observed  in  the  responses  of  a  control  group  who
did  not  participate  in  the  summer  program.

The  test  was  found  to  correlate  0.69  with  intelligence  as
measured  by  the  Otis  Mental  Ability  Test.
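
The Kuder-Richardson Formula 20 reliabilities and the standard error of measurement quoted from the manual follow from two standard formulas, sketched below in Python. The response matrix is hypothetical, and the use of the population variance (dividing by the number of students) is an assumption; the manual's own computational details are not given in this chapter.

```python
import math


def kr20(responses):
    """Kuder-Richardson Formula 20 for right/wrong (1/0) item scores.

    responses : list of rows, one row of 0/1 item scores per student
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean = sum(totals) / n
    variance = sum((t - mean) ** 2 for t in totals) / n  # population variance
    if variance == 0:
        raise ValueError("total scores do not vary; KR-20 is undefined")
    # Sum of p*q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / variance)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


if __name__ == "__main__":
    # Hypothetical responses of four students to three items.
    responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print("KR-20:", kr20(responses))
    print("SEM for SD 10, reliability 0.75:",
          standard_error_of_measurement(10, 0.75))
```

The second function shows why a total-score reliability of 0.76 still leaves an appreciable standard error: the error is the score standard deviation multiplied by the square root of 0.24.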

Tests  of  Scientific  Attitudes

Rokeach's  Dogmatism  Scale.  The  scale  purports  to  be  a  measure  of
the  openness  or  closedness  of  a  person's  belief  systems  and  as  such
it  also  serves  to  measure  general  authoritarianism  and  general  intolerance.
Each  statement  was  designed  to  avoid  specific  ideological  positions,
and  the  forty  items  of  the  scale  are  couched  in  language  and
express  ideas  which  are  familiar  to  the  average  person  in  his  day  to
day  life.  "There  are  a  number  of  people  I  have  come  to  hate  because
of  things  they  stand  for,"  is  an  example  of  a  statement,  and  a  favorable
response  to  this  item  would  indicate  authoritarianism.  The  response
to  each  statement  is  given  by  a  numerical  rating  using  a  scale  from  +3
(I  agree  very  much)  to  -3  (I  disagree  very  much)  with  no  zero  position.
Scoring  is  carried  out  by  adding  4  to  the  value  given  for  each  statement
and  then  summing  up  all  values.
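
The scoring rule just described can be written out directly. The sketch below is a minimal illustration; the function name and the decision to reject a rating of zero with an error are assumptions, and the handling of omitted items is not covered because the text does not describe it.

```python
VALID_RATINGS = {-3, -2, -1, 1, 2, 3}  # the scale has no zero position


def dogmatism_score(ratings):
    """Add 4 to each rating, so that -3..+3 becomes 1..7, then sum."""
    for rating in ratings:
        if rating not in VALID_RATINGS:
            raise ValueError(f"invalid rating: {rating!r}")
    return sum(rating + 4 for rating in ratings)


if __name__ == "__main__":
    # Forty hypothetical ratings from one student.
    ratings = [3, -2, 1, -1, 2, -3, 1, 1, -1, 2] * 4
    print("total score:", dogmatism_score(ratings))
```

For the forty-item scale the total therefore ranges from 40 (every statement rated -3) to 280 (every statement rated +3).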

Reliability  coefficients  for  the  scale  were  reported  from 
studies  of  a  number  of  college  groups  in  the  United  States,  English 
college  groups,  English  workers,  and  a  United  States  Veteran  Affairs 
domiciliary.  The  values  range  from  0.68  to  0.93  with  values  about
0.80  predominating.  There  is  evidence  of  the  validity  of  the  instrument
on  the  basis  of  its  differentiation  of  groups  of  college  students  rated  by 
their  peers  as  high  and  low  on  dogmatism.  Further  evidence  comes 
from  studies  with  political  and  religious  groups. 

It  was  thought  to  be  important  to  try  to  determine  whether
or  not  the  scale  would  be  suitable  for  use  with  grade  X  students  as  the
earlier  use  of  it  had  been  with  older  subjects.  In  a  pilot  study  the  scale
was  administered  to  a  grade  X  class  which  was  not  to  be  part  of  the
testing  project.  For  this  group  a  mean  of  157.5  and  a  standard  deviation
of  31.65  were  obtained;  these  were  not  significantly  different  from
typical  values  obtained  from  U.  S.  college  groups.  An  examination  of
the  responses  showed  that  students  had  used  the  whole  rating  range
quite  freely,  and  there  was  no  tendency  to  use  one  value,  say  +3  or  +1,
excessively.  A  reliability  of  0.79,  by  the  split-half  method  with  the
Spearman-Brown  correction,  was  obtained  in  this  study.  Further,
discussion  with  the  students  indicated  that  the  scale  was  suitable  in
terms  of  its  vocabulary  content  and  that  the  student  reaction  to  it  was
satisfactory,  in  that  they  thought  grade  X  students  would  be  quite  cooperative
in  responding  to  it.
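
The split-half method with the Spearman-Brown correction, mentioned above for the pilot study, can be sketched as follows. The way the items were divided into halves is not stated in the text; an odd-even split is assumed here, and the item scores shown are hypothetical.

```python
import math


def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a half-test has no variance")
    return sxy / math.sqrt(sxx * syy)


def split_half_reliability(item_scores):
    """Correlate odd-item and even-item half scores, then step the
    correlation up to full length with the Spearman-Brown correction."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson(odd_half, even_half)
    return 2 * r_half / (1 + r_half)


if __name__ == "__main__":
    # Hypothetical item scores (1 to 7) for five students on six items.
    scores = [
        [5, 6, 4, 5, 6, 5],
        [2, 3, 2, 1, 3, 2],
        [7, 6, 6, 7, 5, 6],
        [4, 4, 3, 5, 4, 4],
        [1, 2, 2, 1, 1, 3],
    ]
    print("split-half reliability:", split_half_reliability(scores))
```

The correction is needed because the correlation between two half-tests understates the reliability of the full-length scale.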

Maller  and  Lundeen  Superstition  Test.  The  test  described  by  Maller
and  Lundeen  (1934)  consisted  of  fifty  items,  each  describing  how  some
person  acts  or  feels  in  certain  life  situations,  for  example,  "Prefers
to  have  nothing  to  do  with  the  number  13."  The  individual  was  asked  to
indicate  whether  he  felt  or  acted  the  same  way  or  differently.  The  test
was  used  by  the  authors  at  the  grade  VII  level.  The  reliability,  determined
by  the  split-half  method,  was  given  as  0.81.  The  authors  based
a  claim  for  validity  on  the  assumption  of  a  truthful  response  by  the
subjects.  Correlation  with  intelligence  was  given  as  -0.17.  The  work
of  Zapf  (1945a)  has  been  mentioned  in  Chapter  II.  An  interesting
feature  of  the  test  she  used  was  the  method  of  scoring,  in  which  each
item  was  marked  in  one  of  four  ways:  (a)  heard  before  and  believed;
(b)  heard  before  and  not  believed;  (c)  not  heard  before  and  believed;
(d)  not  heard  before  and  not  believed.  The  reliability  of  this  test,  as
measured  by  the  test-retest  method  with  a  two  week  interval  between,
was  0.88.

In  the  light  of  Zapf's  work  it  was  decided  to  use  the  fifty
items  of  the  Maller  and  Lundeen  test  but  to  modify  the  scoring  procedure
in  an  attempt  to  obtain  a  truer  picture  of  student  response.  Placed
beside  the  items  on  the  booklet  there  were  three  columns  labelled
"heard,"  "belief  or  action,"  "influenced";  these  were  explained  by  an
example,  and  the  subject  was  asked  to  check  these  as  appropriate  in
each  case.  In  regard  to  "influenced,"  the  student  was  asked  to  decide
whether,  in  his  actions,  he  was  influenced  by  the  statement.  It  was
believed  that  in  the  attempt  to  evaluate  his  behaviour  objectively  the
student  might  give  a  fairer  assessment  of  himself,  especially  when,  in
this  test  as  in  the  other  attitude  measures,  he  was  assured  that  his
answer  sheet  would  be  seen  only  by  the  investigator.  A  difficulty  with
this  answering  procedure  was  that  it  was  time  consuming.  The  demands
of  the  testing  program  made  it  necessary  to  ask  the  student  to  complete
as  many  items  as  possible  in  the  time  available  without  too  much  haste.
On  examination  of  the  papers  of  the  students  in  the  sample  under  investigation
it  was  found  that  the  first  thirty  items  could  be  used  as  a  basis
for  a  score.  It  was  felt  that  the  only  way  normative  scores  could  be
obtained  was  to  use  the  number  marked  "heard"  by  each  student  as  his
maximum  possible  score.  Scores  for  "belief  or  action"  and  for
"influenced"  were  obtained  by  expressing  the  number  of  positive
responses  under  these  two  headings  as  a  percentage  of  the  number
marked  "heard."  Thus  two  sets  of  scores  were  obtained,  and  these
are  referred  to  in  the  analysis  as  Scientific  Attitude  Scales  IIIA  and
IIIB.  The  reliability  figures  for  the  Maller  and  Lundeen  test  and  for
the  Zapf  test  seemed  to  be  consistent  on  the  basis  that  one  was  twice
as  long  as  the  other.  It  was  believed  then  that  the  reliability  of  the
test  used  in  this  investigation  could  be  based  on  its  being  approximately
half  as  long  as  that  of  Maller  and  Lundeen.  This  gives  a  figure  of
about  0.7.
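
The percentage scoring and the length adjustment described above can be illustrated with a short Python sketch. The text does not name the formula used for the length adjustment; the general Spearman-Brown prophecy formula is assumed here because it reproduces the quoted figure. The function names, and the choice to return no score when a student marked no items as "heard," are also assumptions.

```python
def percent_of_heard(n_heard, n_positive):
    """Express positive responses as a percentage of items marked 'heard'.

    Returns None when no items were marked 'heard', since the score is
    then undefined (how such cases were treated is not stated).
    """
    if n_heard == 0:
        return None
    return 100.0 * n_positive / n_heard


def spearman_brown(reliability, length_ratio):
    """Predicted reliability when test length is multiplied by length_ratio."""
    return (length_ratio * reliability) / (
        1.0 + (length_ratio - 1.0) * reliability
    )


if __name__ == "__main__":
    # A hypothetical student who had heard 20 of the 30 items and
    # marked "belief or action" for 5 of them.
    print("Scale IIIA score:", percent_of_heard(20, 5))
    # Halving a test whose reliability is 0.81.
    print("predicted reliability at half length:",
          round(spearman_brown(0.81, 0.5), 2))
    # Doubling the same test, for comparison with the Zapf figure of 0.88.
    print("predicted reliability at double length:",
          round(spearman_brown(0.81, 2.0), 2))
```

Under this assumption, halving a test with reliability 0.81 gives a predicted value of roughly 0.68, in line with the figure of about 0.7 given in the text.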

Reference  Tests 

Induction;-  Standard  Prdgressive  Matrices,  Sets  A,  B,  C,  D,  and  E, 
(1956),  This  is  a  well  established  test  prepared  by  J.  C.  Raven.  It 
consists  of  matrix  items  which  are  two-way  figure  analogy  problems. 
One  principle  controls  the  changes  from  left  to  right  in  the  figures 
and  another  controls  changes  from  top  to  bottom.  The  pattern  is 
incomplete  in  each  case  and  the  subject  selects  the  one  design  which 
will  complete  it,  from  several  available  designs.  The  items  are 
arranged  in  increasing  difficulty,  and  there  are  sixty  in  all.  It  is 
generally  regarded  as  a  measure  of  the  inductive  reasoning  factor. 
Anastasi  (1961)  said  that  it  is  regarded  by  British  psychologists  as  the 
best available measure of g, "requiring chiefly the eduction of relations."
(1961, p. 261). Cronbach (1960) regarded it as "one of the
best available techniques for obtaining a non-verbal measure of
reasoning ability." (1960, p. 438).

Values of the reliability coefficient ranging from 0.70 to
0.90 were given by Anastasi (1961) for older children and adults.

The  test  is  usually  given  as  a  power  test,  the  time  required 
being about forty-five minutes, but it has also been used with a time
limit of twenty minutes, especially in extensive use in the British
Army,  with  quite  satisfactory  results  according  to  Vernon  and  Parry 
(1949).  They  gave  a  value  for  its  reliability  coefficient  of  0.88  when 
used  under  these  conditions.  Because  of  the  significance  of  reasoning 
in  this  investigation  it  was  thought  that  it  was  better  to  include  this  well- 
established  and  extensive  test,  with  a  time  limit,  than  one  of  the  briefer 
tests  of  induction  which  are  available,  but  which  are  also  speeded. 

Deduction: Holzinger-Crowder Uni-Factor Tests: Test 9, Teams.
This is one of nine tests in this set published in 1955. According to
the manual (1955) the set measures "four types of mental activity or
aspects of mental ability, designated Verbal, Spatial, Numerical, and
Reasoning." (1955, p. 1). The tests were designed to be as pure as
possible as measures of the four abilities, that is, with little overlap
between them. Factorial analysis results given in the manual indicate
that the tests for a given factor have high loadings on that factor but low
loadings  on  other  factors.  The  Teams  Test  is  concerned  with  syllogistic 
reasoning.  It  consists  of  several  groups  of  facts  and  the  subject  must 
decide  whether  or  not  a  set  of  conclusions  follows  from  the  facts  in 
each  case. 

The reviewers in the Mental Measurements Yearbook (1959)
agreed on the high technical quality of these tests, though they commented
on the narrowness of the content, and the question was raised
whether the factorial purity may not be due partly to this narrowness
of content and to the highly speeded nature of most of the tests. The
reasoning test under discussion is not so highly speeded as some of
the others.

The manual gave reliability data only for the total reasoning
score from three reasoning tests. The figure was about 0.90, by the
split-half method, for the different grades. It is likely that the reliability
of each test would be at least 0.70 and probably about 0.80.
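
One way to check this estimate, offered here only as a sketch, is to invert the Spearman-Brown formula. If the total reasoning score is treated as the sum of k = 3 parallel tests of equal length (an assumption not stated in the manual), a composite reliability r_kk of 0.90 implies a single-test reliability r_11 of:

```latex
r_{11} = \frac{r_{kk}}{k - (k-1)\,r_{kk}}
       = \frac{0.90}{3 - 2(0.90)}
       = \frac{0.90}{1.20}
       = 0.75
```

Under that assumption the figure falls between the two values suggested above.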

The only evidence relevant to the validity of the test lies in correlations
ranging from 0.60 to 0.72 between the total Holzinger-Crowder
Reasoning score and various intelligence tests.

General Reasoning: Mathematics Aptitude Test. This is a test from
the Educational Testing Service Kit of Selected Tests (1954). The
special kit was prepared to provide reference tests for factor analyses.
This particular test was taken from the reasoning section of the American
Council on Education Psychological Examination. According to the
manual (1954), it has been found that this type of test, along with various
non-mathematical tests, loads on a reasoning factor which appears
to be separate from mathematics achievement. The factor has been
called general reasoning. The test consisted of twenty items, each one
being an arithmetical problem expressed in verbal form.

No information about the reliability of the test is available,
but it is expected that it would be of the order of 0.80. The test is
similar to Part IV of the SCAT test for which a reliability of 0.82 was
given by the manual (1955). Though there is no guarantee that the
construction was done so thoroughly as in the case of the SCAT test,
the value of 0.82 may be regarded as an indication of the reliability.
Owing to the fact that the test was speeded, it was not possible to
determine its reliability from the project without re-administering
it to some of the students.

Tests of Number Facility. Two tests from the Educational Testing
Service Kit (1954) were selected as measures of this "facility in handling
numbers in arithmetical operations" (1954, p. 20), a factor which
has been clear in many analyses.

(a)  Addition  Test:  The  ninety  items  require  simple  addition  operations. 

(b) Subtraction and Multiplication Test: Forty-five subtraction and
forty-five multiplication items made up this test.

Again no information is available about the reliability coefficients,
but they were expected to be high in view of the simple nature of the
tests. Because the tests were highly speeded, reliability coefficients
could not be determined in this project, but the fact that the correlation
coefficient for the two tests was 0.71 would indicate that each test
probably would have a reliability in excess of this figure.
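
The reasoning behind this inference can be made explicit. In classical test theory the correlation between two tests cannot exceed the geometric mean of their reliabilities. The symbols below are introduced here for illustration, with r_12 the observed correlation and r_11, r_22 the two reliabilities:

```latex
r_{12} \le \sqrt{r_{11}\, r_{22}}
\quad\Longrightarrow\quad
\sqrt{r_{11}\, r_{22}} \ge 0.71
```

Strictly, this guarantees only that the geometric mean of the two reliabilities is at least 0.71; the conclusion that each test exceeds this figure rests on the further assumption that the two tests are about equally reliable.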

Test  of  Verbal  Ability:  Vocabulary.  This  test  is  also  from  the  ETS 
Kit.  It  was  adapted  from  the  Cooperative  Vocabulary  Test  and  is 
representative  of  tests  which  load  heavily  on  the  verbal  factor.  It 
consists of thirty-six five-choice synonym items. Again reliability
data were not available, but by comparison with the SCAT Verbal test the
reliability was estimated to be at least of the order of 0.80.

Holzinger-Crowder Uni-Factor Tests: Test 3 (Boots) and Test 4
(Hatchets). These tests were designed to give a measure of spatial
ability. Each consists of seventy items in which the examinee decides
whether two pictured boots (or hatchets as the case may be) are viewed
from the same side or from different sides. The tests are highly
speeded and according to the manual "what is measured is a kind of
facility or adeptness in the manipulation of images in two-dimensional
space." (1955, p. 1). The comments already made about these tests
in  general,  including  the  reports  of  reviewers,  apply  to  the  spatial 
tests  as  well  as  to  the  reasoning  tests. 

Owing  to  delay  in  delivery  of  the  tests,  it  was  necessary  to 
use  those  which  were  available  with  separate  answer  sheets,  rather 
than having the students report their responses on the test-item sheets,
as is intended. This change would involve extra time and so three
minutes were allowed instead of two and a half as working time. It
was acknowledged that the changed conditions could alter the results
of these two tests; for example, some kind of clerical factor might be
introduced. However, the purpose here was not to compare performance
with that on other spatial tests but merely to define a factor, and
the  modification  seemed  to  be  unimportant  in  this  regard  so  long  as 
the  new  procedure  was  standardized  for  all  subjects.  Care  was  taken 
to  do  this. 

Reliability coefficients given by the publishers are of the order
0.83. The fact that the correlation between the two spatial tests in
the present investigation was 0.79 would indicate
that the reliability here was much the same as the quoted value.

Cooperative School and College Ability Tests (SCAT) (1956, 1957).
Published by the Educational Testing Service, this test reflects a
shift which Cronbach (1960) noted, away from emphasis upon measuring
"mental ability" to a greater concern with school learned abilities
when one's aim is to predict school success. According to the manual
the main purpose of the test is "to estimate the capacity of each
individual student to undertake the academic work of the next higher
level of schooling" (1955, p. 3). The test is divided into four parts.
Parts I and III are concerned with "developed ability in skills that
are closely related to student success in the verbal kinds of school
learning." (1955, p. 3). Parts II and IV are "measures of ability
in certain quantitative skills of number manipulation and problem
solving" (1955, p. 3). Thus the test gives a Verbal Score and a
Quantitative Score, and these may be combined to give a total score
if desired. The items are all of the five response multiple choice type.
The verbal measure is made up of sentence completion items (Part I)
and vocabulary (Part III); numerical computation (Part II) and numerical
problem-solving items (Part IV) make up the quantitative measure.

The reviewers in The Fifth Mental Measurements Yearbook
(1959) were generally agreed concerning the high quality of the construction
of these tests. They differed in regard to the nature of
their criticisms, ranging over such matters as the purpose of the
tests, the value of the norms data and the accuracy of the Kuder-Richardson
values as estimates of the reliability. This latter topic
is the only one which concerns the present investigation to any extent.
There is the suggestion that the values may be somewhat in error
because of a degree of speededness in the tests. It would seem,
however, that the reliabilities would be still quite high even if the
stated values are slightly inflated. The reliability estimates given
by the manual are 0.93 for the Verbal Score and 0.91 for the Quantitative
Score. The publishers gave evidence of the content and concurrent
validity of the test in the manual (1955), and evidence of its
predictive validity in a later supplement (1958).

These  tests  had  been  administered  to  all  grade  IX  students  in 
Edmonton  at  the  end  of  their  1960-1961  school  year.  The  raw  scores 
were  made  available  by  the  Alberta  Department  of  Education. 

School  Science  Tests. 

Departmental Examinations, 1961. Grade IX. Science. This examination
was constructed for the Alberta Department of Education by the
High School Entrance Examination Board. It consisted of two sections.
The first, of an objective type, has already been discussed under the
subheading "A Test of Science Knowledge." The second section consisted
of various kinds of questions including essay, completion, matching,
and problem types. Also obtained from the Department of Education
was a total score for each subject. To obtain this, the scores on the
two sections of the examination had been combined with a school estimate
and the distribution of the resulting total for all Alberta students
fitted to a normal curve.

Science  X  School  Grades.  These  grades  were  determined  by  the  class 
teachers  on  the  basis  of  examinations  written  on  several  occasions 
during  the  grade  X  year.  Thus  the  scores  were  not  the  result  of  a 
single  test,  nor  were  they  the  product  of  the  same  set  of  tests  as 
these differed with the different schools. However, they were all
expressed  on  a  scale  from  0  to  100. 

Preliminary  Investigations 

Several minor projects were carried out to prepare the tests
for the main investigation. Mention has already been made of the
trial of the Rokeach Dogmatism Scale to check its suitability for the present
investigation. Other preliminary testing was necessary in the construction
of two tests. It has been noted in earlier chapters that authorities
recommend the inclusion of three tests which are judged to be measures
of a factor which is hypothesized. This necessitated the construction
of a test of problem-solving ability. Also, since no suitable test of
curiosity could be found in the literature, a test was constructed in an
attempt to measure this attitude.

There are published tests other than STEP Science which seem to
be related to problem-solving ability; for example, the manual for the
Iowa Tests of Educational Development claims that these tests measure
ability to do critical thinking and to apply principles to new situations.
However, time limitations made it impossible to include such tests as
well as STEP Science, and in view of the more favorable comments about
the latter by reviewers, it was believed that it was the most suitable to
be used. It was thought that its two sections could be used as subtests.
If a unique problem-solving ability is measured by the test, this variable
would surely be operative in the two halves of a test of such length,
there being thirty items in each section. To provide a third test of this
kind it was decided to construct a test similar to the STEP Science test
and such that it could be completed by the subjects in one class period.
The realization that a test constructed within a limited amount of time
would almost certainly be lower in reliability than one produced with
such a vast expense of effort and resources as the STEP test, was to
some extent offset by the fact that it would be more closely related to
the content of courses in which the students had participated. It was
thought that it would be adequate to confirm a factor which may be
defined by the two STEP Science subtests.

The  Testing  Program 

Discussion  with  the  cooperating  teachers  led  to  a  program  in  which 
the  testing  was  spread  over  three  months.  Two  periods  were  taken  in 
April,  three  in  May,  and  two  in  June.  The  periods  in  a  given  month  were 
taken close together, for example on succeeding days. It was possible
for the investigator to administer all the tests, thus reducing possible
variations  in  procedures  which  might  occur  with  different  personnel 
doing  the  testing. 

Prior  to  the  first  testing  period  the  investigator  visited  the  classes 
concerned  and  spoke  to  the  students  about  the  project  in  order  to  seek 
their cooperation. After being introduced by the class teacher, he introduced
the project in the following terms:


We  hear  a  great  deal  these  days  about  research  in  space  and 
other scientific fields of study. The Science Education Department
at the University is carrying out a research project in
which we are investigating some special tests for science teaching.
Mr. (name of School Principal), Mr. (Head of Science
Department)  and  Mr.  (Class  Teacher)  are  helping  us  with  this 
project,  and  we  would  like  you,  along  with  two  other  classes  at 
(name  of  School)  and  three  classes  in  each  of  two  other  schools, 
to  try  out  these  tests.  To  help  us  in  the  investigations  we  need 
information  from  other  tests  and  some  of  these  will  be  similar 
to  ones  you  have  done  before,  some  of  them  will  be  different. 

The  first  one  in  fact  involves  no  words  and  is  more  like  a 
puzzle  than  the  ordinary  test. 

The  scores  you  make  will  be  used  by  us  to  investigate  the 
usefulness  of  these  tests  in  science  teaching  and  some  of  them 
will  provide  your  teachers  with  information  about  your  abilities 
which  will  be  useful  in  guiding  you  in  your  future  schoolwork. 

The scores however will in no way affect your school grades
and no preparation is necessary on your part. By doing as
well as you possibly can on the tests you will be helping your
teachers, yourselves, future science students and us in our
investigation.

The first tests will be given to-morrow and Thursday and the
others from time to time between now and the end of the year. I
think you will find them interesting, and from some of them you
will learn new things.

In two of the schools the class periods were thirty-eight minutes
in length and in the third school fifty minutes. In order to make the
maximum  use  of  time,  test  booklets  and  answer  sheets  were  distributed 
before  the  students  entered  their  room.  Where  more  than  one  test  was 
being  administered  during  a  period,  the  tests  were  usually  stapled  into 
one  booklet. 

The  STEP  Science  test  was  administered  in  two  periods,  Sections 
I and II being separated for this purpose. It was necessary to cut the
actual working time from thirty-five to thirty-three minutes, but the
same procedure was followed in all groups. Also, the test is designed as
a power test, and in the present case it was found that students had ample
time to finish.

Because  of  the  experimental  nature  of  the  Test  of  Understanding 
Science,  and  because  of  an  agreement  with  its  authors  and  publishers, 
the administration followed the outline in the manual without modification.
In order to achieve this the schools made special arrangements
where  necessary  to  allow  the  amount  of  time  required. 

In  the  case  of  the  attitude  scales  care  was  taken  to  assure  the 
subjects  of  the  confidential  nature  of  their  responses.  In  addition  to 
the  precise  instructions  on  the  scale,  a  statement  was  made  that, 
though  a  score  would  be  returned  to  the  school,  the  answer  sheets 
would be seen only by the investigator. It was thought that greater
objectivity in the responses might be obtained by eliminating the actual
titles of the attitude scales from the test form. They were identified,
accordingly, by the titles Scientific Attitude Scale I, II, or III as the
case may be.

Investigations  to  Test  the  Hypotheses 

The investigation of the data involved several factor analyses.
As a first step, an analysis of all the subtests, along with the reference
variables, was carried out. Because this was especially concerned with
the  exploratory  aspects  of  the  study,  the  analysis  was  carried  out  upon 
the  results  of  the  185  students  for  whom  a  complete  set  of  data  was 
available. 

In  view  of  the  fact  that  one  of  the  hypotheses  regarding  the 
validity  of  tests  was  concerned  with  school  examinations  used  in 
Edmonton,  it  was  important  to  test  this  hypothesis  by  an  analysis  upon 
a  representative  sample.  In  this  analysis  the  total  scores  for  the 
STEP  Science  test  and  the  Test  of  Understanding  Science  were  used 
because  the  total  tests  would  have  higher  reliability  and  thus  the 
analysis  would  be  freer  of  error  variance. 

To  test  hypothesis  5  an  analysis  of  six  subtests  of  STEP  Science 
was  performed. 

Another  minor  analysis  was  one  used  to  throw  light  upon  the 
postulate that, at the high school level, performance on tests including
items from different sciences such as biology, chemistry, and
physics  is  determined  by  general  intellectual  ability  rather  than 
abilities  specific  to  the  different  sciences. 

A  detailed  account  of  all  these  analyses  is  given  in  Chapter  VI 
and the conclusions drawn from the investigation and certain implications
are set out in the concluding chapters.

CHAPTER V


TEST  CONSTRUCTION 

A  Test  of  Application  of  Scientific  Knowledge 

As  has  been  indicated  in  the  previous  chapter,  this  test 
was  designed  to  sample  some  of  the  same  hypothetical  variables  as 
those  measured  by  STEP  Science.  It  was  intended  that  this  test  should 
provide  a  third  measure  of  problem-solving  ability,  that  is,  along 
with  STEP  Science  Parts  I  and  II. 

The abilities reported by the STEP test constructors were listed
and attention was concentrated on three which seemed most closely
related to three of Guilford’s phases in problem-solving, "the analysis
of the difficulty," "production," and "verification." The three which
seemed to be most relevant were ability to define problems, ability
to suggest and screen hypotheses, and ability to draw conclusions.
An analysis was made of the grade IX science and Science X courses,
which the subjects had followed, and a list of topics was made. Extensive
reading of periodicals and books was then undertaken to find
information about new developments in science which were related to
the topics listed, but which required an application of knowledge of
facts and principles from those topics to new situations. When sufficient
information was assembled the task of item writing was undertaken.
The format was made similar to the STEP test. The items
were of the multiple choice variety with four alternative responses,
and the items were arranged in groups, each group being related to
a certain situation. The instructions were also similar. Fifty items
were assembled into two tests of twenty-five each. The items were discussed
with science teachers at the school and university levels, and
with a professor experienced in test construction, and subsequently
revised. The two tests were then administered to two Science X
classes which were not part of the main testing project. There were
fifty-eight students in all. The tests were also discussed with these
students. Their reactions were generally favorable; some minor
misunderstandings were discovered.

An  item  analysis  was  carried  out  on  the  scores  for  the  students. 
The  nomographs  of  Colver  (1959)  were  used  to  determine  the  item 
indices. On the basis of this analysis twenty-five items were selected
using "item difficulty" and "item validity" as criteria. As a guide,
Garrett's (1958) suggestions of approximately fifty per cent as item
difficulty, and item discrimination indices above 0.2 were employed.
Some reorganization and rewording of the stems were necessary to
assemble the twenty-five chosen items into the final test. Such editing
was  kept  to  a  minimum,  and  apart  from  changing  a  few  words  which  had 
caused  misunderstandings,  the  responses  to  the  items  were  left  intact. 
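
The two item indices can be illustrated with a short Python sketch. It is not the nomograph procedure of Colver (1959) used in the study; it assumes the common upper-lower group method, in which difficulty is the proportion of students answering an item correctly and discrimination is the difference in that proportion between the top and bottom 27 per cent of students ranked by total score. The function names and the data layout are hypothetical.

```python
def item_indices(responses, group_fraction=0.27):
    """Compute (difficulty, discrimination) for each item.

    responses: list of per-student lists of 0/1 item scores (hypothetical
    layout). Difficulty is the proportion correct over all students.
    Discrimination is the proportion correct in the upper group minus
    that in the lower group, the groups being the top and bottom
    `group_fraction` of students by total score.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    ranked = sorted(responses, key=sum, reverse=True)
    n_group = max(1, round(group_fraction * n_students))
    upper, lower = ranked[:n_group], ranked[-n_group:]

    indices = []
    for j in range(n_items):
        difficulty = sum(r[j] for r in responses) / n_students
        p_upper = sum(r[j] for r in upper) / n_group
        p_lower = sum(r[j] for r in lower) / n_group
        indices.append((difficulty, p_upper - p_lower))
    return indices


def select_items(indices, min_discrimination=0.2):
    """Return positions of items whose discrimination exceeds the cutoff."""
    return [j for j, (_, d) in enumerate(indices) if d > min_discrimination]
```

The default cutoff of 0.2 mirrors the guide value quoted above; a difficulty screen near fifty per cent could be added in the same way.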

Due  to  an  accident  in  the  preparation  of  the  test  booklets  just 
prior  to  testing,  the  first  four  items  were  not  used.  The  explanation 
was given to all classes in the project so that they were able to commence
at item 5, and all classes were treated alike.

The scores of the students in the main project were used as
the basis for a further item analysis. The values for item difficulty
and for the discrimination index are shown in Table IX. The values
for the first four items are those which were obtained on the original
item analysis. It can be seen that item 12 is unsatisfactory in terms of
its discriminatory power. It was not used to obtain the scores which
were included in the factor analyses. The final scores for the students
were then based on twenty items.

TABLE II

ITEM INDICES FOR THE TEST OF APPLICATION
OF SCIENTIFIC KNOWLEDGE

(N = 271)

Item Number    Difficulty Index    Discrimination Index

    * 1.             .95                  .40
    * 2.             .86                  .28
    * 3.             .76                  .28
    * 4.             .65                  .40
      5.             .53                  .58
      6.             .49                  .40
      7.             .60                  .59
      8.             .63                  .36
      9.             .79                  .50
     10.             .16                  .40
     11.             .35                  .32
     12.             .41                  .03
     13.             .34                  .44
     14.             .48                  .38
     15.             .23                  .26
     16.             .85                  .63
     17.             .51                  .55
     18.             .47                  .41
     19.             .54                  .49
     20.             .52                  .32
     21.             .21                  .38
     22.             .31                  .28
     23.             .31                  .42
     24.             .40                  .40
     25.             .29                  .24

*The values for these items are those obtained on a preliminary
item analysis.

Using the responses by students in the main project to the twenty
items, the reliability of the test was estimated by the split-half method
using the Spearman-Brown correction. The value obtained was 0.48.
It was hoped that the reliability would be higher than this, but for
twenty items it is not surprising. On this basis, a test of sixty items
similar to those used in the test under discussion could be expected to
have a reliability of about 0.75. The STEP test of sixty items has a
reliability of 0.81 as estimated by the Kuder-Richardson Formula 20.
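
The projection to sixty items follows from the Spearman-Brown formula, sketched below in Python. The function name is hypothetical; the formula itself is the standard one for a test lengthened by a factor k with comparable items.

```python
def spearman_brown(reliability, length_factor):
    """Project the reliability of a test lengthened by `length_factor`.

    Standard Spearman-Brown prophecy formula: r_kk = k*r / (1 + (k-1)*r).
    With length_factor=2 it is the correction applied to a split-half
    correlation to estimate the reliability of the full-length test.
    """
    k = length_factor
    return k * reliability / (1 + (k - 1) * reliability)


# Twenty items with reliability 0.48, lengthened to sixty items (k = 3).
projected = spearman_brown(0.48, 3)
print(round(projected, 2))  # 0.73
```

The result, roughly 0.73, is in line with the figure of about 0.75 given in the text.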

A value of 0.48 would not be good for predictive purposes, and it is
desirable that tests used in factor analyses should have high reliability.
However, the main purpose for which this test was constructed was to
provide another variable similar to the STEP test to confirm any factor
which may be defined by the two parts of the latter test, for example
a unique problem-solving ability factor. For this purpose it was
probably  adequate. 

A  copy  of  the  test  is  included  as  Appendix  A  to  the  thesis. 

Scale  to  Measure  Attitude  to  Investigation  and  Discovery  of  Knowledge 

This scale was constructed in an attempt to provide an instrument
which would measure curiosity. As previously mentioned, no
such instrument could be found in the literature though two authors
are collaborating on the construction of a test for use with elementary
school children.

An investigation of references to curiosity was made to determine
those characteristics which are believed to mark human curiosity.
The work of Berlyne (1960) has been mentioned in Chapter II. Krech
and Crutchfield (1958) described curiosity as an interest in novel things
and a "reaching out to encompass more and more of the world." (1958,
p. 287). They said that exploratory behavior can sometimes be explained
in terms of drives for food or for safety, but that there is a curiosity
drive in addition. This drive may be inhibited by lack of opportunity,
or by punishment when it is expressed. Young (1961) regarded curiosity
as a "wish for new experience" and said, "scientific curiosity has been
described as a desire to know, to understand." (1961, p. 55). Woodworth
(1958) noted that curiosity is evident in children in their exploratory
behavior. They investigate their environment and manipulate objects to
see what they will and will not do. He said that curiosity is stimulated
by questions to which the answers are unknown. Maw and Maw (1961)
decided that an elementary school child exhibited curiosity
to the extent that he:

1. reacts positively to new, strange, incongruous or mysterious
elements in his environment by moving toward them,
by exploring them or by manipulating them

2. exhibits a need or desire to know more about himself and/or
his environment

3.  scans  his  surroundings  seeking  new  experiences 

4.  persists  in  examining  and  exploring  stimuli  in  order  to 
know  more  about  them.  (1961,  p.  297). 

The survey of such references led to the decision to construct a scale
to measure the attitude of students to investigation and discovery of
knowledge. It was believed that a favorable attitude along this
dimension would be desirable for scientific work, and a desirable
response for persons generally. Further, it was thought that such a
favorable attitude probably would be related to the display of curiosity.
Four aspects of this attitude were considered: the desire to discover
new things and the effect of opposition on this, reactions to strange
situations, the view of knowledge held, and motives for investigation.

The evidence of Jones (1960) and of Edwards (1957) indicates
that the successive interval method of scale construction is superior
to the category methods such as the equal-appearing interval technique.
Thus the former seemed to be the best one to use, especially
as paired-comparison methods, which are probably the best, are not
suitable for the present study because they give ipsative rather than
normative scores. The demands of the testing program made it
impossible to carry out the calculations needed by the successive
interval technique for the selection of items for the scale. Since the
method of ratings by the judges is the same for both the successive
interval and the equal-appearing interval methods, it was decided to
select the items on the basis of equal-appearing interval scale-values,
and later to calculate the successive interval values before scoring
the responses of students in the main project. The method outlined by
Edwards (1957) provided the general basis for the construction of the
scale. (In this discussion further references to that particular text
by Edwards will be made without the inclusion of the date of publication.)

In order to collect statements which would represent the
opinions of grade X students, a short questionnaire was drawn up
based upon the four aspects mentioned. Covering the desire to discover
new things were questions such as "Does your curiosity drive
you to seek answers even if your questions are sometimes unpopular?"
The question "Is the exploration of new places and the investigation of
curious objects of interest to you, or are you more at home with
familiar places and familiar objects?" was concerned with the
reaction to strange situations. Questions related to the view of
knowledge, and to motives for investigation, are exemplified respectively
by "Do you think too much knowledge can be a bad thing?" and "Should
we seek knowledge for practical purposes only (for example, to get a
job we will like, or to discover new, useful drugs), or is it good to
seek knowledge just for its own sake?" The questionnaire was
answered by students in two Science X classes which were not part
of the main testing program. They were encouraged to write complete
sentences as answers. Some of the students also contributed further
to the project by writing short essays on the subject. The responses
of some students were not helpful as a source of opinion statements,
but others were very fruitful. Out of these, a group of seventy-nine
statements was drawn up to represent shades of opinion from the very
favorable to the very unfavorable.

In order to produce the scale from the initial set of statements,
it was decided to have them judged for degree of favorability
by adult judges, and to try out a preliminary version of the scale on
a group of students. The latter procedure would serve to check that
the understanding of a statement held by the adults was also the typical
student understanding of the statement. Edwards gives evidence that
reliable scale values can be obtained with as few as fifteen judges. In
this investigation twenty-five judges were used. These included
professors, teachers, graduate students, and the wives of some of these.
Each one of the judges was given a set of seventy-nine statements,
each statement being recorded on a separate slip of paper. The
judges were asked to rate these on an 11-point scale from most
favorable to least favorable, using instructions based on those used by
Thurstone and Chave (1929). The ratings by one of the judges were
rejected because twenty-six of the seventy-nine statements had been
allocated to category I, considerably in excess of the allotment by any
other judge. This rejection was in line with the criterion used by
Thurstone and Chave for eliminating judges who misunderstood the
directions or were careless in the task.

From the distribution of the ratings for each statement the
median was found, giving the scale value, and the Q-value (the
interquartile range) was obtained graphically as a measure of the
spread of the distribution. Forty statements were selected with the
criteria of low Q-value and scale values representing all points of
the scale proportionately.
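The calculation of a scale value and Q-value from the judges' ratings can be sketched as follows. This is only an illustration under stated assumptions: the study obtained Q graphically, whereas the sketch uses the usual grouped-data interpolation, treats category k as the interval from k - 0.5 to k + 0.5, and uses invented ratings. The function names are hypothetical.

```python
def percentile_from_counts(counts, fraction):
    """Interpolated percentile of ratings grouped into successive categories.

    counts[i] is the number of judges who placed the statement in category
    i + 1, and that category is treated as the interval (i + 0.5, i + 1.5).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one rating is required")
    target = fraction * total
    cumulative = 0
    for index, frequency in enumerate(counts):
        if frequency > 0 and cumulative + frequency >= target:
            lower_limit = index + 0.5  # lower boundary of this category
            return lower_limit + (target - cumulative) / frequency
        cumulative += frequency
    raise ValueError("fraction must lie between 0 and 1")


def scale_value_and_q(counts):
    """Return (median, Q), where Q is the interquartile range of the ratings."""
    median = percentile_from_counts(counts, 0.50)
    q = percentile_from_counts(counts, 0.75) - percentile_from_counts(counts, 0.25)
    return median, q


# Hypothetical ratings of one statement by 24 judges on the 11-point scale.
example_counts = [0, 0, 1, 3, 6, 8, 4, 2, 0, 0, 0]
scale_value, q_value = scale_value_and_q(example_counts)
print(round(scale_value, 2), round(q_value, 2))
```

A statement on which the judges agree closely gives a small Q, which is why low Q-value served as a selection criterion.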

The scale was then tried out on a Science X class not participating
in the major project, and the responses of the students were
used for an item analysis. From this analysis twenty-one statements,
as suggested by Edwards, were selected. They were selected to
represent all points of the scale approximately equally. In addition,
satisfactory discrimination, using the criterion cited previously,
namely an index greater than 0.2, was required for the items away from
the middle of the scale. For the neutral items it seemed that a low
absolute value of the discrimination index was probably the goal to be
aimed at. In order to have sufficient items with scale values just below
the middle of the range, it was necessary to use two items from the
original seventy-nine which had not been included in the
forty used for the trial test. Thus, discrimination indices were not
available for these, but their Q-values were not greater than the
Q-values of the other statements.

The  statement  "My  curiosity  drives  me  to  seek  answers  even 
when  my  questions  are  unpopular"  is  an  example  of  an  item  related  to 
the  desire  to  discover  new  things,  whilst  "Too  much  knowledge  can 
drive  a  person  crazy"  concerned  the  view  of  knowledge  held  by  the 
subject. 

On the basis of the judges' ratings for the twenty-one items
selected, successive interval values were calculated using the method
outlined by Edwards. The internal consistency test suggested by that
author was also made, and an average error of 0.026 was found for
the successive interval values. This compares well with the typical
values given by Edwards. However, when the successive interval
scale values were plotted against the equal-appearing interval values
the graph was so close to a straight line that it did not seem possible
to conclude, in this case, that one set of values was better than the
other. It was thought that possibly the number of judges used here
was not sufficient to meet the requirement of normality of the
distribution of ratings for each item, which is assumed in the successive
interval technique. In view of the evidence, the simpler equal-appearing
interval values were used.

The  scale  values,  Q-values,  and  discrimination  indices  for  the 
statements  are  given  in  Table  III  and  the  scale  is  included  in  the  thesis 
as  Appendix  B. 

                              TABLE III

    SCALE VALUES, Q-VALUES AND DISCRIMINATION INDICES FOR
       THE SCALE TO MEASURE ATTITUDE TO INVESTIGATION
                AND DISCOVERY OF KNOWLEDGE

Statement     Scale     Q-Value     Discrimination
Number        Value                 Index

    1         10.3        2.6          .60
    2          0.9        1.9          .26
    3          1.0        1.4          .20
    4          7.9        1.0          .65
    5          5.7        1.6          .20
    6          8.9        1.2          .70
    7          3.0        1.4          .45
    8          1.4        1.1          .40
    9          9.0        2.0          .70
   10          4.6        2.4           *
   11          2.4        0.8          .50
   12          8.7        1.4          .44
   13          6.1        1.6          .20
   14          6.8        1.6          .34
   15          0.6        0.8          .60
   16          5.6        1.5          .55
   17          3.2        1.5          .60
   18          7.2        1.7          .22
   19          7.5        2.3          .43
   20          3.5        1.8           *
   21         10.1        3.0          .60

*These statements were not used in the preliminary version of the
scale.

                              TABLE IV

         COMPARISON BETWEEN SCALE SCORES OF GROUPS
             RATED HIGH AND LOW ON CURIOSITY

Variable              High        Low         Test of          Significance
                      Curiosity   Curiosity   Significance     of
                      Group       Group       for Difference   Difference

Total High and Low Groups (5 classes)
  N                   27          22          -                -
  Standard Deviation  1.63        1.28        F = 1.81         N.S.
  Mean                3.83        4.61        t = 1.86         .05

High and Low Groups from 3 classes
  N                   19          12          -                -
  Standard Deviation  1.56        1.15        F = [illegible]  N.S.
  Mean                3.41        4.86        t = 3.32         .01

An attempt was made to provide evidence of the validity of the
scale by comparing its rankings for a group of students with the
rankings for the same students by teachers. The teachers of three of the
nine classes in the project were chosen and asked to select from their
classes a small group (say about six) of students who clearly
displayed high curiosity and a small group characterized by low curiosity.
They were given criteria to serve as guides and to help achieve
comparability between the raters. These criteria included the extent
to which students were active in asking questions and seeking knowledge
generally. They were asked to try to dissociate curiosity from
intelligence in their thinking about their students, and it was pointed out
that curiosity tends to be more general in its scope than interest. For
example, there is a tendency for interest to be directed to one subject
or hobby. One of these teachers also selected high and low groups in
each of two other Science X classes. The investigator met these classes
for a short period on one occasion to administer the scale. Pooling of
the ratings for all classes produced a high group of 27 students and a
low group of 22 students. The comparisons of the means and standard
deviations of these groups are summarized in Table IV. The F-test, as
suggested by Walker and Lev (1953), was used to check the hypothesis
that the variances of the two groups were not significantly different,
this being an assumption upon which the t-test depends. The t-test was
used to compare the means of the two groups, a one-tailed test being used
because the direction of the difference between the means was hypothesized
in advance.
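The two statistics named above, the variance ratio and the t-test for independent groups, can be sketched from summary figures as follows. This is an illustrative sketch only: it assumes standard deviations computed with n - 1 in the denominator and a pooled-variance t, which may not match the exact formulas of Walker and Lev, and the input values are invented rather than taken from Table IV. The function names are hypothetical.

```python
import math


def variance_ratio(sd_a, sd_b):
    """F ratio of two variance estimates, larger variance in the numerator."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's t for two independent groups, assuming equal variances.

    Returns (t, degrees_of_freedom). sd_a and sd_b are taken to be sample
    standard deviations (n - 1 in the denominator); this is an assumption.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical summary statistics for two groups of ten students each.
f_ratio = variance_ratio(2.0, 1.6)
t_value, df = pooled_t(5.0, 2.0, 10, 3.0, 2.0, 10)
print(round(f_ratio, 2), round(t_value, 2), df)
```

For a one-tailed test the computed t is compared with the critical value for the chosen level in one tail only, with n_a + n_b - 2 degrees of freedom.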

It can be seen from the table that the difference between the means
of the total validation groups was significant at the 0.05 level. However,
if the three classes which were part of the main testing project were
considered alone, the difference between the means of the high and low
groups from this sample was significant at the 0.01 level. There was no
significant difference between the means of the high and low groups
obtained from the two classes which did not participate in the main
project. The reason for this difference is not clear. However, two
possible explanations suggest themselves. Firstly, the three classes in
the main project had worked with the investigator over six class periods
before they responded to the attitude scale. Contact with the other two
classes was confined to a short period on one occasion for administration
of the scale. It could be that differences in the degree of confidence of
the subjects in the investigator caused a difference in the degree of
cooperation. Secondly, there is evidence that the reliability of ratings
is quite low, as Cronbach (1960) has shown. In this regard it will be
noted that the two extra classes were rated by one of the judges
concerned with the other three classes. It may be that the reliability of
his ratings differs considerably from that of the other two judges. On
the other hand, in so far as the unreliability of judges' ratings is due
in part to their inadequate knowledge of the subjects, it could be
pointed out that the teachers in the present study had worked with the
students continuously for about nine months and were thus in a very
favorable position to give satisfactory ratings.


CHAPTER  VI 


ANALYSIS  OF  THE  DATA 

This chapter is chiefly concerned with the factor analyses which
were carried out in order to test the hypotheses set out in Chapter III.
An account will be given first of the particular method of factor analysis
and of rotation used, and then the results will be presented and interpreted.

The  Principal-Factor  Method  of  Analysis 

The technique of analysis used in this investigation is the
principal-factor method. This has some properties similar to the centroid
solution; in fact the latter is an approximation to the principal-factor
solution. When computer facilities are available the principal-factor
technique is likely to be preferred because it gives a unique mathematical
solution for a given correlation matrix, and because it takes out the
maximum possible amount of variance with a given number of factors.

When the scores of persons on two positively correlated tests are
plotted using scales representing the range of test scores as axes, an
elliptical distribution of points is obtained, the familiar "scattergram."
If the scores on three variables are plotted using the corresponding axes
an ellipsoidal distribution results. The procedure can be extended
analytically to more than three dimensions, and the distribution is still
referred to as ellipsoidal. If there are n tests this distribution is made
up of N points (one for each person) in an n-space (one dimension for each
test). Actually, the loci of points of uniform frequency density produce a
set of more or less similar and similarly placed n-dimensional ellipsoids.
The corresponding principal axes of these ellipsoids are in the
same directions and it is these directions which are used as reference
axes. When a transformation is carried out so that the n tests are
considered in an N-space (one dimension corresponding to each person), the
test vectors take up such positions that, when they are adjusted to unit
length, the cosines of the angles between them are equal to the
correlation coefficients. The factor loadings are given by the
perpendicular projections of the test vectors on the reference axes.
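The statement that the cosine of the angle between two test vectors equals their correlation can be checked numerically. The sketch below assumes that each test vector holds the scores expressed as deviations from the test mean (without this centering the identity does not hold); the scores are invented and the function names are hypothetical.

```python
import math


def centered(scores):
    """Express scores as deviations from their mean."""
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]


def cosine(u, v):
    """Cosine of the angle between two vectors in person space."""
    dot = sum(a * b for a, b in zip(u, v))
    length_u = math.sqrt(sum(a * a for a in u))
    length_v = math.sqrt(sum(b * b for b in v))
    return dot / (length_u * length_v)


def pearson(x, y):
    """Product-moment correlation computed from covariances."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sd_x * sd_y)


# Hypothetical scores of five persons on two tests (N = 5, so a 5-space).
test_a = [12, 15, 9, 20, 17]
test_b = [30, 34, 25, 41, 33]
print(cosine(centered(test_a), centered(test_b)), pearson(test_a, test_b))
```

The two printed values agree apart from rounding error, because dividing a centered vector by its length is the same operation as standardizing the scores.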

In algebraic terms, as outlined by Thurstone (1948), the above
geometric picture corresponds to finding a set of principal-factors such
that the first one accounts for the maximum amount of the variance which
is possible for any position of an axis in the test space; the second factor
accounts for the maximum possible amount of the residual variance, and
similarly for each succeeding factor till all the common variance has been
accounted for. This is equivalent to choosing the factor loadings $a_{j1}$
($j = 1, 2, \ldots, n$) so as to make the residuals a minimum.

Since

$$ r_{jk} = a_{j1}a_{k1} + a_{j2}a_{k2} + \cdots + a_{jm}a_{km} + a_{j}a_{k} $$

where j, k represent two tests, the residual $(r_{jk} - a_{j1}a_{k1})^2$
must be made a minimum for all j and k.

A least squares solution for this is given by the solutions of the equation

$$ (R - \lambda I)\,a = 0 $$

where

R is the correlation matrix of order (n x n),

$\lambda$ is a scalar (1 x 1), written $\lambda_p$ for a given
principal-factor p,

I is an identity matrix of order (n x n),

a is an (n x 1) column vector.

Harman (1960) pointed out that the necessary and sufficient condition for
the n linear equations represented by this equation to have a nontrivial
solution is that the following determinant should be zero:

$$ |R - \lambda I| = 0 $$

This last equation is called a characteristic equation. The solution of
the characteristic equation leads to n values of $\lambda$, the eigenvalues
or latent roots. When one of these eigenvalues $\lambda_p$ is substituted in
$(R - \lambda I)\,a = 0$, a solution in the form of a column vector
$(\alpha_{1p}, \alpha_{2p}, \ldots, \alpha_{np})$ is obtained, the
eigenvector or latent vector. There are n such vectors in all, one for each
$\lambda$. The eigenvector is converted to a normalized eigenvector
$\beta_p$ by dividing each element by the square root of the sum of the
squares of all the elements:

$$ \beta_{jp} = \frac{\alpha_{jp}}{\sqrt{\alpha_{1p}^2 + \alpha_{2p}^2 + \cdots + \alpha_{np}^2}}, \qquad (j = 1, \ldots, n). $$

It can be shown that the factors for the original correlation matrix are
then given by the product of each normalized eigenvector and the square
root of the corresponding eigenvalue:

$$ a_{jp} = \beta_{jp}\sqrt{\lambda_p} $$
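As a small worked illustration (not taken from the study's data), the procedure can be followed by hand for two tests with correlation $r > 0$ and unities in the diagonal. The symbols $\lambda$ for an eigenvalue and $a_{jp}$ for the loading of test j on factor p follow the text; the example itself is an assumption made for clarity.

```latex
R = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix},
\qquad
|R - \lambda I| = (1 - \lambda)^2 - r^2 = 0
\;\Longrightarrow\;
\lambda_1 = 1 + r, \quad \lambda_2 = 1 - r .

% Normalized eigenvectors (each has unit length):
\tfrac{1}{\sqrt{2}}\,(1,\; 1) \ \text{for } \lambda_1,
\qquad
\tfrac{1}{\sqrt{2}}\,(1,\; -1) \ \text{for } \lambda_2 .

% Loadings = normalized eigenvector times square root of eigenvalue:
a_{11} = a_{21} = \sqrt{\tfrac{1 + r}{2}},
\qquad
a_{12} = -a_{22} = \sqrt{\tfrac{1 - r}{2}} .

% Check: the loadings reproduce the correlation and the unit diagonal.
a_{11}a_{21} + a_{12}a_{22} = \tfrac{1 + r}{2} - \tfrac{1 - r}{2} = r,
\qquad
a_{11}^2 + a_{12}^2 = 1 .
```

The first factor, with the larger eigenvalue $1 + r$, has positive loadings on both tests, which illustrates why the first principal factor of a set of positively correlated tests behaves as a general factor.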


The principal-factor method gives a unique mathematical solution
for the given correlation matrix. By virtue of the nature of the method
the first factor produced is a general factor, and rotation to psychological
meaning is performed on all the common factors.

Empirical evidence has been provided by C. Wrigley (1958) that
for large matrices the factorial solution is little different when
"communalities" or unities are used in the diagonal cells of the matrix
to be factored. When unities are employed the number of common factors
equals the number of tests. However, in general a relatively small number
of these factors account for most of the variance. As a guide, Kaiser
(1960) has suggested that all those factors should be considered as
significant for which $\lambda > 1$. Wrigley (1959) has provided empirical
evidence that the best approximation to the communality is the squared
multiple correlation of each variable with the remaining ones. With small
matrices it is best to use these squared multiple correlations in the
diagonal cells. In this case all the factors for which $\lambda > 0$ may be
considered, but Harman (1960) has argued that only a number of factors
necessary to account for the starting communality is warranted. This
number may be less than the number for which $\lambda > 0$.
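The squared multiple correlation of each variable with the remaining ones can be obtained from the inverse of the correlation matrix by the standard identity SMC_j = 1 - 1/r^jj, where r^jj is the j-th diagonal element of the inverse. The sketch below is illustrative only; the matrix is invented, the helper names are hypothetical, and a plain Gauss-Jordan inversion is used for self-containment rather than because the study used it.

```python
def invert(matrix):
    """Inverse of a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [rv - factor * cv for rv, cv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def squared_multiple_correlations(correlations):
    """SMC of each variable with all the others: 1 - 1 / (inverse diagonal)."""
    inverse = invert(correlations)
    return [1.0 - 1.0 / inverse[j][j] for j in range(len(correlations))]


# Hypothetical correlation matrix for three tests.
r_matrix = [[1.0, 0.5, 0.5],
            [0.5, 1.0, 0.5],
            [0.5, 0.5, 1.0]]
print([round(v, 3) for v in squared_multiple_correlations(r_matrix)])
```

These values would replace the unities in the diagonal cells before the matrix is factored.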

This method of analysis only becomes practicable for large
matrices when computer facilities are available. In this investigation
the analyses were carried out with the assistance of the IBM 1620 machine
of the Computing Centre at the University of Alberta. The program used
was based upon the Jacobi method of finding the eigenvalues and
eigenvectors for real symmetric matrices. In this method the off-diagonal
elements of the matrix R are reduced to zero, one at each stage of
successive orthogonal transformations, with the aid of a matrix B of special
form involving sines and cosines. The process continues till all the
off-diagonal elements are within a given tolerance of zero. During this
process R is modified to a diagonal matrix D, and B is changed so that

$$ D = B R B^{T} $$

The eigenvalues of R are then given by the diagonal elements of D, and
the eigenvectors are given by the rows of the final matrix B.
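The Jacobi procedure described above can be sketched in a few lines. This is a modern illustration, not the IBM 1620 program used in the study: it sweeps cyclically over the off-diagonal elements, the tolerance and sweep limit are arbitrary choices, and the function names are hypothetical. It keeps the convention of the text, in which D = B R B' and the eigenvectors are the rows of B.

```python
import math


def jacobi_eigen(matrix, tolerance=1e-12, max_sweeps=100):
    """Eigenvalues and unit eigenvectors of a real symmetric matrix.

    Returns (values, vectors) sorted from the largest eigenvalue down;
    vectors[p] is the eigenvector belonging to values[p].
    """
    n = len(matrix)
    d = [[float(v) for v in row] for row in matrix]  # becomes D
    b = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # becomes B
    for _ in range(max_sweeps):
        largest = max((abs(d[i][j]) for i in range(n) for j in range(i + 1, n)),
                      default=0.0)
        if largest < tolerance:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if d[p][q] == 0.0:
                    continue
                # Rotation angle that reduces element (p, q) to zero.
                theta = 0.5 * math.atan2(2.0 * d[p][q], d[q][q] - d[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for r in range(n):  # rotate columns p and q of D
                    drp, drq = d[r][p], d[r][q]
                    d[r][p], d[r][q] = c * drp - s * drq, s * drp + c * drq
                for k in range(n):  # rotate rows p and q of D and of B
                    dpk, dqk = d[p][k], d[q][k]
                    d[p][k], d[q][k] = c * dpk - s * dqk, s * dpk + c * dqk
                    bpk, bqk = b[p][k], b[q][k]
                    b[p][k], b[q][k] = c * bpk - s * bqk, s * bpk + c * bqk
    order = sorted(range(n), key=lambda p: d[p][p], reverse=True)
    return [d[p][p] for p in order], [b[p] for p in order]


def principal_factor_loadings(matrix, n_factors):
    """Loadings: each unit eigenvector times the root of its eigenvalue."""
    values, vectors = jacobi_eigen(matrix)
    return [[vectors[p][j] * math.sqrt(max(values[p], 0.0))
             for p in range(n_factors)] for j in range(len(matrix))]


# Hypothetical correlation matrix for three tests, unities in the diagonal.
r_matrix = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]]
print(jacobi_eigen(r_matrix)[0])
```

The sign of each eigenvector is arbitrary, so a whole column of loadings may be reflected without changing the solution.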

The Varimax Analytical Rotation

Harman (1960) has traced the development of objective methods of
rotation. Though Thurstone's simple structure principles embodied an
attempt to achieve objectivity, in practice rotation on this basis was
still more of an art than a science. This fact led to attempts to provide
a mathematical basis for rotation.

Harman (1960) summarized solutions presented by Neuhaus and
Wrigley, Carroll, Saunders, Ferguson, and by Kaiser. He suggested
that the fundamental basis for all these is to achieve parsimony in the
solution, this being a more basic principle than simple structure. The
measure of parsimony can be thought of as "the position of the
configuration on an hypothetical continuum of all possible configurations,
from the completely chaotic to the ideal configuration in which each
variable is of unit complexity" (Harman, 1960, p. 292).

The method used in the present study is that of Kaiser (1959)
called the "normal varimax" solution. By empirical investigation, this
method has been shown to agree well with intuitive graphical procedures
based on the simple structure principles. Also, it is claimed that the
varimax rotation has the property of invariance, so that the factor
structure found for a test in one battery will remain the same when that
test is moved to another battery if the varimax rotation is used on this
new analysis as well.

In the varimax method emphasis is placed upon simplification of
the columns of the factor matrix by maximizing the variance of the
squared loadings of the factors. The components of a factor then tend
to unity or zero. This involves finding factor loadings such as to
maximize the function

$$ V = n \sum_{p=1}^{m} \sum_{j=1}^{n} \left( \frac{a_{jp}}{h_j} \right)^{4} - \sum_{p=1}^{m} \left( \sum_{j=1}^{n} \frac{a_{jp}^{2}}{h_j^{2}} \right)^{2} $$

where

$a_{jp}$ is the factor loading for the factor p on test j,
$h_j^2$ is the communality for the test j.

This  process  is  achieved  with  the  aid  of  computers. 
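The quantity being maximized can be evaluated directly for any loading matrix. The sketch below uses one common statement of Kaiser's normal varimax criterion (squared loadings divided by communalities, summed over m factors and n tests); it only evaluates the criterion and applies a single planar rotation, and is not the full iterative rotation program. The loadings are invented and the function names are hypothetical.

```python
import math


def normal_varimax_criterion(loadings):
    """Normal varimax criterion for an n-tests by m-factors loading matrix."""
    n, m = len(loadings), len(loadings[0])
    communalities = [sum(a * a for a in row) for row in loadings]
    total = 0.0
    for p in range(m):
        # Squared loadings on factor p, normalized by each test's communality.
        squared = [loadings[j][p] ** 2 / communalities[j] for j in range(n)]
        total += n * sum(x * x for x in squared) - sum(squared) ** 2
    return total


def rotate_pair(loadings, p, q, angle):
    """Orthogonal rotation of factors p and q through the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    rotated = [list(row) for row in loadings]
    for row in rotated:
        ap, aq = row[p], row[q]
        row[p], row[q] = ap * c + aq * s, -ap * s + aq * c
    return rotated


# Hypothetical loadings with perfect simple structure on two factors.
simple = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.9]]
blurred = rotate_pair(simple, 0, 1, math.pi / 4)
print(normal_varimax_criterion(simple), normal_varimax_criterion(blurred))
```

The criterion is larger for the simple-structure matrix than for the same configuration rotated through 45 degrees, where every test loads equally on both factors. A rotation program searches over such planar rotations for the angles that maximize it.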


The  Analysis  of  Sub-Tests  and  Other  Variables 

The Principal-Factor Pattern. The mathematical model for factor
analysis makes no assumptions about the normality of the distributions
of the scores on the tests; thus the raw scores were used. Actually the
only variable for which the distribution was obviously astray from the
normal distribution was the attitude scale IIIA. This was badly skewed
and truncated. It could be predicted that the distribution of superstitious
belief in the high school population would be normal. However, the
analysis involving all the sub-tests was largely intended for exploratory
purposes and it was decided to retain the scale IIIA for this purpose.


The STEP Science scores were not changed into converted scores
as suggested in the manual for use in comparing students' results. In
the present study the concern was with the nature of the test itself, and
any conversion involving the addition of a constant to the raw scores, or
their multiplication by a positive constant, would not change the
correlation of this test with others.
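The invariance of the correlation under such a conversion is easy to verify. In the sketch below the raw scores and the conversion constants are invented for illustration and do not come from the STEP manual; the function name is hypothetical.

```python
import math


def pearson(x, y):
    """Product-moment correlation between two lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical raw scores on one test and scores on a second test.
raw = [18, 22, 15, 27, 20, 24]
other = [9, 12, 7, 14, 8, 13]

# Hypothetical linear conversion: multiply by 2.5 and add 230.
converted = [2.5 * score + 230 for score in raw]

print(pearson(raw, other), pearson(converted, other))
```

Both correlations are the same apart from rounding error. Multiplication by a negative constant would reverse the sign of the correlation, which is why the constant is taken to be positive.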


Starting from the raw scores of the 185 students for whom a full
set of data was available, the twenty-four variables were intercorrelated
by means of the IBM 1620 computer using a program for Pearson
product-moment coefficients. It is important that the correlation
coefficients used in a factor analysis should all be product-moment
coefficients. Point biserial coefficients were calculated for the
correlation of sex with the other variables. According to Walker and Lev
(1953) the point biserial coefficient is a product-moment correlation.
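That the point biserial coefficient is a product-moment correlation can be confirmed by computing it both ways. In the sketch below the dichotomous variable is coded 0 and 1, the standard deviation is the one computed with N in the denominator, and the data are invented; the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Product-moment correlation between two lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def point_biserial(groups, scores):
    """Point biserial r from group means: (M1 - M0) / s * sqrt(p * q)."""
    n = len(scores)
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    p = len(ones) / n
    q = 1.0 - p
    mean_all = sum(scores) / n
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd * math.sqrt(p * q)


# Hypothetical data: sex coded 0/1 and a test score for eight students.
sex = [0, 1, 1, 0, 1, 0, 0, 1]
score = [14, 19, 17, 12, 21, 15, 11, 16]
print(point_biserial(sex, score), pearson(sex, score))
```

The two values agree apart from rounding error, so a dichotomous variable such as sex can enter the correlation matrix alongside the continuous variables.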

For convenience in reference the following is a list of the
twenty-four variables used in the first analysis, showing the full title
of each and the abbreviation used in the tables to refer to it. The
abbreviation is given in parentheses following the name of the test.

1.  Sequential Test of Educational Progress, Science, Form 2A Part I
    (STEP Sc. I)

2.  Sequential Test of Educational Progress, Science, Form 2A Part II
    (STEP Sc. II)

3.  A Test of Application of Scientific Knowledge (TASK)

4.  Departmental Examinations, 1961, Grade IX, Science. The High
    School Entrance Examination Board, Alberta Department of
    Education, Section A (Science 9A)

5.  Departmental Examinations, 1961, Grade IX, Science. Section B
    (Science 9B)

6.  Science X School Grades 1962, for three Edmonton high schools
    (Science 10)

7.  Test of Understanding Science (TOUS) Form W. Scale I (TOUS I)

8.  Test of Understanding Science (TOUS) Form W. Scale II (TOUS II)

9.  Test of Understanding Science (TOUS) Form W. Scale III (TOUS III)

10. Test of Number Facility. Addition Test (Number I)

11. Test of Number Facility. Subtraction and Multiplication (Number II)

12. Test of Verbal Ability: Vocabulary (Verbal)

13. Holzinger-Crowder Uni-Factor Tests: Test 3 (Boots) (Spatial I)

14. Holzinger-Crowder Uni-Factor Tests: Test 4 (Hatchets) (Spatial II)

15. Standard Progressive Matrices, Sets A, B, C, D, and E, prepared
    by J. C. Raven (R. Matrices)

16. General Reasoning: Mathematics Aptitude Test (Gen. Reasoning)

17. Holzinger-Crowder Uni-Factor Tests: Test 9 (Teams) (H-C.
    Syllogism)

18. Cooperative School and College Ability Tests, Quantitative (SCAT
    Quant.)

19. Cooperative School and College Ability Tests, Verbal (SCAT Verbal)

20. Rokeach's Dogmatism Scale (Sc. Att. I)

21. Scale to Measure Attitude to Investigation and Discovery of
    Knowledge (Sc. Att. II)

22. Maller and Lundeen Superstition Test scored for belief in
    superstition (Sc. Att. IIIA)

23. Maller and Lundeen Superstition Test scored for influence by
    superstitions (Sc. Att. IIIB)

24. Sex

The possible scores, means, and standard deviations of the
variables are listed in Table V and the correlation coefficients are set
out in Table VI.

                              TABLE V

              POSSIBLE SCORES, MEANS, AND STANDARD
              DEVIATIONS FOR TWENTY-FOUR VARIABLES

                              N = 185

Variable                  Possible      Mean      Standard
                          Score                   Deviation

 1.  STEP Sc. I              30         18.68       3.88
 2.  STEP Sc. II             30         16.62       4.18
 3.  TASK                    20          9.02       2.91
 4.  Science 9A              50         36.44       5.77
 5.  Science 9B             100         64.58      14.11
 6.  Science 10             100         63.14      15.40
 7.  TOUS I                  18         10.14       2.63
 8.  TOUS II                 18         11.12       2.43
 9.  TOUS III                24         10.89       3.10
10.  Number I                90         29.89       7.85
11.  Number II               90         42.92      11.41
12.  Verbal                  36         21.25       5.04
13.  Spatial I               70         38.17      12.93
14.  Spatial II              70         45.04      13.16
15.  R. Matrices             60         48.35       4.70
16.  Gen. Reasoning          20          9.83       2.65
17.  H-C. Syllogisms         30         19.97       6.15
18.  SCAT Quant.             50         40.38       6.01
19.  SCAT Verbal             60         49.11       7.29
20.  Sc. Att. I             280        155.30      20.70
21.  Sc. Att. II           10.6          4.29       1.57
22.  Sc. Att. IIIA          100         16.65      18.19
23.  Sc. Att. IIIB          100         43.00      24.70
24.  Sex

                              TABLE VI

  CORRELATION MATRIX FOR ALL SUBTESTS AND REFERENCE VARIABLES*

[The body of the correlation matrix is not recoverable from the source.]

*  [...] be regarded as significant.

** See text for full names of variables.

The correlation coefficients were then subjected to factor analysis
using unities in the diagonal cells of the matrix. The computation was
carried out by the Jacobi method of analysis on the IBM 1620 computer.
Six eigenvalues greater than one were obtained. Two other values were
so close to one that it was decided to retain the factors corresponding to
them for investigation. Thus eight principal-factors were extracted from
the analysis, and the question of the significance of this number of factors
was deferred till after the rotations were completed.

The unrotated factor matrix is shown in Table VII along with the
communalities of the variables. At the bottom of the table the sums of
the squares of the loadings for each factor are given, and these are equal
to the corresponding eigenvalues. The percentage of the common variance,
and of the total variance, taken out by each factor is also shown.
The factors are arranged in decreasing order in terms of the percentage
of variance accounted for. These principal-factors account for the
maximum amount of variance which is possible for eight factors.
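The percentages at the foot of Table VII follow directly from the eigenvalues. The sketch below uses the eight sums of squares given in the table; the leading digit of the first value is illegible in the source and is taken here as 7.603, the figure implied by the row total of 17.717. The variable names are hypothetical.

```python
# Sums of squared loadings (eigenvalues) for the eight retained factors.
# The first value is inferred from the row total; see the note above.
eigenvalues = [7.603, 2.612, 2.041, 1.428, 1.180, 1.015, 0.933, 0.905]
n_variables = 24  # with unities in the diagonal, total variance = 24

common_variance = sum(eigenvalues)

# Share of the variance carried by the eight factors together.
percent_of_common = [100.0 * value / common_variance for value in eigenvalues]

# Share of the total variance of the twenty-four variables.
percent_of_total = [100.0 * value / n_variables for value in eigenvalues]

# Kaiser's guide: count the eigenvalues greater than one.
kaiser_count = sum(1 for value in eigenvalues if value > 1.0)

for value, common, total in zip(eigenvalues, percent_of_common, percent_of_total):
    print(f"{value:6.3f} {common:6.1f} {total:6.1f}")
print("eigenvalues greater than one:", kaiser_count)
```

The count of eigenvalues greater than one is six, in agreement with the text, and the seventh and eighth values fall only slightly below one.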

Analytical  rotations  were  then  carried  out  on  this  factor  matrix 
using  Kaiser's  normal  varimax  technique.  In  order  to  come  to  a-deci- 
sion  about  the  number  of  common  factors  to  be  counted  as  significant, 
three  rotations  were  performed  with  the  six,  seven,  and  eight  largest 
factors  respectively.  Kaiser  (I960)  argued  that  many  of  the  statistical 
tests  often  used  in  the  past  are  not  appropriate.  He  said  that  the  one 
put  forward  by  Lawley  is  the  only  correct  one,  but  the  computations 
for  this  are  exceedingly  laborious  even  with  computers.  He  contended 
that  in  view  of  the  work  of  Guttman  on  the  algebraic  requirements  of  the 
problem,  the  best  guide  is  to  consider  all  those  eigenvalues  greater  than 
unity,  but  in  addition  he  placed  much  emphasis  on  psychological  meariing- 
fulness  as  a  criterion.  In  the  present  case  rotations  with  six  and  seven 
factors  left  the  first  factor  still  quite  large,  whereas  rotation  with  eight 
factors  caused  a  good  deal  more  change  producing  a  pattern  which 
seemed  to  give  the  best  psychological  meaning.  In  view  of  this  fact  and 
TABLE VII

UNROTATED FACTOR MATRIX FROM THE ANALYSIS OF ALL
THE SUBTESTS AND REFERENCE VARIABLES*

                                     Common-Factor Loadings
Variable                    I     II    III     IV      V     VI    VII   VIII   Communality

STEP Sc. I                773   -285   -021   -107   -084    060    016    098        712
STEP Sc. II               706   -337   -006   -077   -109    100    262    146        729
TASK                      596   -439   -111   -016   -111    184   -018    195        645
Science 9A                758   -122   -093    020   -319   -192   -022   -066        742
Science 9B                775    081   -154    037   -211   -255   -093   -080        756
Science 10                763    167   -056   -032   -181   -215   -019   -190        729
TOUS I                    617    033   -161   -273    310    234   -024    009        633
TOUS II                   397   -152   -219   -108    622    208   -237   -017        727
TOUS III                  685    001   -185   -065    052   -064    117    210        573
Number I                  272    652   -241    451   -006    174   -025    184        825
Number II                 067    724   -292    391   -060    100   -143    155        824
Verbal                    718    090   -363   -247    045    064   -132    061        744
Spatial I                 387    279    717   -034    147   -017   -022    321        868
Spatial II                403    261    689   -008    189   -102   -037    327        859
R. Matrices               451    029    560   -181    123   -042    079   -323        678
Gen. Reasoning            712    238    159    177   -182    156    141   -077        703
H-C. Syllogism            514    228    195   -117   -032    119   -249   -427        626
SCAT Quant.               695    293    241    129   -134    079    107   -149        701
SCAT Verbal               778    061   -244   -176    024    102   -073    044        719
Sc. Att. I               -257   -157    224   -201   -421    101   -731    177        933
Sc. Att. II              -332    201    140   -069   -166    689    098   -227        738
Sc. Att. IIIA            -380    433   -037   -567   -191    110    120    163        744
Sc. Att. IIIB            -298    449   -166   -589   -207   -068    240    130        788
Sex                       106   -628    173    244   -258    323    152    179        721

Sums of Squares         7.603  2.612  2.041  1.428  1.180  1.015  0.933  0.905     17.717
% of Common Variance     42.9   14.7   11.5    8.1    6.7    5.7    5.3    5.1      100.0
% of Total Variance      31.7   10.9    8.5    6.0    4.9    4.2    3.9    3.8       73.9

*Decimal points have been omitted from the factor loadings.
Any comparisons of the loadings should only be made in terms of the number of
figures which can be regarded as significant.
because the eigenvalues corresponding to factors seven and eight were
only slightly less than one, it was decided to adopt this as the solution.
Table VIII shows the factor matrix which resulted from the rotation.
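
The eigenvalue rule discussed above can be illustrated with a short sketch. It is a minimal example only, assuming NumPy is available; the function name `kaiser_count` and the small correlation matrix are hypothetical and are not data from this study.

```python
import numpy as np

def kaiser_count(corr):
    """Return the eigenvalues (largest first) of a correlation matrix and
    the number of them that exceed unity."""
    eigvals = np.linalg.eigvalsh(np.asarray(corr, dtype=float))[::-1]
    return eigvals, int(np.sum(eigvals > 1.0))

# Hypothetical 4 x 4 correlation matrix (illustrative values only).
R = np.array([
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])

eigvals, n_retained = kaiser_count(R)
print(eigvals, n_retained)
```

Applied to the sums of squares reported in Table VII, a count of this kind gives six factors, since the seventh and eighth values (0.933 and 0.905) fall just below unity. The eight-factor solution adopted here therefore rests on the additional criterion of psychological meaningfulness.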

For the sake of clarity, and because of the number of figures which
could be counted as significant in any value, factor coefficients are
given to only two decimal places in this table, and only those loadings
which are greater than 0.2 are listed, except where there is some special
interest in the other loadings.

Interpretation of the Factors. With eight factors the pattern is reasonably
easy to interpret with the assistance of the marker variables.

Factor I is interpreted as a verbal-educational factor like the
v:ed factor frequently found by English analysts (Vernon, 1961). All
the tests of achievement in science have high loadings on the factor
except the Test of Understanding Science Scale II; this exception can be
accounted for by the high loading of that scale on factor II, and it will be
taken up in that connection. Factor I is apparently not a specialized science
ability, however, because the verbal tests also have high loadings upon it. The
moderate loadings of SCAT Quantitative and the general reasoning test
also lend weight to this interpretation. These tests involve school-learned
content, and though they load on other factors it is reasonable to expect
them to contribute to a verbal-educational factor.

Factor  II  is  best  interpreted  as  a  verbal  factor.  The  Test  of 
Understanding  Science  Scales  I  and  II  have  high  loadings  on  this  factor 
and  Scale  III  also  has  a  loading  on  it.  Thus  these  three  scales  have 
something  in  common  which  causes  them  to  contribute  to  the  factor. 
However,  the  two  verbal  tests  also  have  moderate  loadings  upon  it,  and 
TABLE VIII

ROTATED FACTOR MATRIX FROM THE ANALYSIS OF ALL
THE SUBTESTS AND THE REFERENCE VARIABLES*

Variable              Common-Factor Loadings           Communality
                      (listed entries, in factor order)

STEP Sc. I               77   18   11  -19                    712
STEP Sc. II              77   07  -02  -17                    730
TASK                     70   17  -11  -30                    645
Science 9A               75   31   03   20                    742
Science 9B               68   38   20   31                    757
Science 10               62   50   16   24                    730
TOUS I                   47   59   16   12                    633
TOUS II                  17   81   03   06  -20               726
TOUS III                 66   43   18   17   18               573
Number I                 88                                   826
Number II                90                                   823
Verbal                   68   43   18                         743
Spatial I                92                                   867
Spatial II               91                                   860
R. Matrices              14   55  -29   48   16               678
Gen. Reasoning           54   38   29   31   17               703
H-C. Syllogism           23   70                              627
SCAT Quant.              46   48   26   37   17               700
SCAT Verbal              70   39   22                         719
Sc. Att. I              -94                                   933
Sc. Att. II             -28  -78                              739
Sc. Att. IIIA           -18   81                              743
Sc. Att. IIIB            87                                   786
Sex                      31  -21  -32  -30  -39  -49          721

Factor                      I     II    III     IV      V     VI    VII   VIII      Total

Sums of Squares          5.55   1.58   1.94   2.15   2.28   1.12   1.13   1.97     17.716
% of Common Variance     31.3    8.9   11.0   12.1   12.9    6.3    6.4   11.1      100.0
% of Total Variance      23.1    6.6    8.1    9.0    9.5    4.6    4.7    8.2       73.8

*For clarity loadings below 0.2 are omitted except in special cases.
Decimal points are omitted from the loadings.
it seems to be accounting for verbal ability over and above that involved
in factor I. This interpretation is supported by the fact that male
sex is negatively loaded on the factor; that is, girls tend to do better
on this ability. Vernon (1960) has said that factorial evidence indicates
that girls do a little better than boys on most verbal tests. On
examining the Test of Understanding Science, it is noted that items 31
to 37 inclusive, which require a more complicated procedure than the other
items, all occur in Scale II. Also many of the items require a good deal
of comparison of phrases and words, and the stems or responses are
often quite long. Such examination of the test supports the evidence
above concerning the nature of this factor.

Factor III is a broad reasoning factor, in line with postulate 2 of
Chapter III that such a reasoning factor could be obtained by the inclusion
of reference tests for induction, general reasoning, and deduction. The
three reasoning marker tests all load on the factor. The SCAT Quantitative
measure also loads on it, and this is quite understandable because
half of that test involves items of the mathematics problem-solving
variety which are commonly found to load on a general reasoning factor.
A point of considerable interest in regard to this factor is that the school
science tests load much more heavily than the tests selected as measures
of problem-solving ability. This aspect of the analysis will be taken up
again.

Factor IV is clearly a factor covering facility in numerical computation,
the N factor described by Thurstone. The test of general reasoning
and the SCAT Quantitative test, both involving mathematical problems,
have values appearing on this factor, as would be expected.
The two spatial marker tests define factor V. It is a spatial
factor, and it is found to account for some of the variance of the Progressive
Matrices test, as other investigators have found (Vernon, 1961).

Factor  VI  is  an  attitude  factor  largely  defined  by  the  Scientific 
Attitude  Scale  I,  Rokeach's  Dogmatism  Scale,  which  has  a  very  high 
factor  coefficient  in  this  case.  Since  high  scores  on  that  scale  are 
regarded  as  an  indication  of  dogmatism,  the  use  of  a  negative  sign  on 
the loading for the variable allows the factor to be labelled open-mindedness.
Other tests have small or negligible weights on this factor. It is
of  interest  that  two  of  the  reasoning  tests,  and  the  SCAT  Quantitative 
variable  which  also  measures  reasoning,  have  positive  loadings  though 
they  are  small.  Thus  there  is  a  suggestion  that  the  open-minded  subject 
has  better  reasoning  ability  than  the  dogmatic  person. 

Scientific attitude scale II makes the major contribution to
factor VII. This scale purports to measure the attitude of subjects
towards investigation and discovery of knowledge. When the scale is negatively
weighted the factor implies a favorable attitude, since favorability corresponds
to low scores on the scale. It would be quite consistent to find
that those who have a favorable attitude of this kind also do better in
school science, as the factor coefficients for those variables indeed indicate.
The fact that scale II also has a favorable loading on factor I,
as well as the values for the school science tests on factor VII, suggests
that an attitude to school may be related to what scientific attitude
scale II is measuring. If this is an attitude-to-knowledge factor, then
the loading on sex suggests that, on the average, girls in this sample
have a more favorable attitude to knowledge.
Factor VIII is defined by the two scales designed to measure superstitious
belief. Positive loadings on those scales indicate superstitiousness
in terms of the method by which the scoring is carried out. The loading
on sex would indicate that girls are more superstitious than boys, in
line with previous evidence. It is of interest to note that the Test of
Understanding Science Scale II, "Understandings about Scientists," has a negative
loading on the factor. This would suggest that superstitious persons tended
to have unfavorable attitudes to scientists. The reasons for the loadings of
the three tests selected as measures of problem-solving ability on the
factor are not clear. Tentatively, the factor may be regarded as measuring
superstitiousness.

The  Analysis  of  the  Complete  Tests  and  the  Reference  Variables.  The 
Factor  Pattern.  Since  the  reliabilities  of  the  complete  tests  would  be 
higher  than  those  of  the  subtests,  it  was  likely  that  the  factor  loadings 
of  these  tests  would  be  higher  when  the  full  tests  were  used.  Thus  total 
scores  for  STEP  Science  and  for  the  Test  of  Understanding  Science  were 
employed in another analysis in an attempt to augment the evidence provided
by the first analysis. In this case scientific attitude IIIA was not
included  since  it  seemed  to  be  measuring  the  same  thing  as  the  scale  IIIB 
and  probably  less  reliably. 

Since hypothesis 3 was concerned with examinations used in
Edmonton schools, it was desirable that investigations of these examinations
should be made upon a representative sample. This was done in
the second analysis. The correlation coefficients were recalculated,
and are shown in Table IX. It will be noted that they are very similar
to the first set. They are, in general, somewhat smaller because the
sample is more homogeneous in regard to general intellectual ability than
that used for the first analysis.

The principal-factor analysis was then performed on the correlation
matrix and the factor matrix obtained is shown in Table X. Table XI
sets out the rotated factor matrix after the varimax rotation. Five eigenvalues
greater than one were obtained, but a sixth was only very slightly
smaller than one, its value being 0.992. The next largest eigenvalue
was 0.889. Three rotations were carried out using the five, six, and
seven largest factors respectively. It was found that seven factors gave
the most meaningful solution, and one that was similar to the first analysis.
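
The varimax rotation mentioned above can be sketched as follows. This is a minimal illustration, assuming NumPy; it uses a common singular-value iteration for the raw varimax criterion and is not necessarily the computational procedure (or normalization) used in the present study. The function name `varimax` and the loading matrix are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=200, tol=1e-10):
    """Orthogonally rotate a loading matrix toward the raw varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix."""
    A = np.asarray(loadings, dtype=float)
    n_vars, n_factors = A.shape
    T = np.eye(n_factors)            # accumulated orthogonal rotation
    previous = 0.0
    for _ in range(max_iter):
        B = A @ T
        # Gradient of the varimax criterion with respect to the rotation.
        G = A.T @ (B**3 - B * (np.sum(B**2, axis=0) / n_vars))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                   # nearest orthogonal matrix to G
        current = s.sum()
        if previous != 0.0 and current / previous < 1.0 + tol:
            break
        previous = current
    return A @ T, T

# Hypothetical unrotated loadings for four variables on two factors.
unrotated = np.array([
    [0.70,  0.40],
    [0.65,  0.45],
    [0.60, -0.50],
    [0.55, -0.55],
])

rotated, T = varimax(unrotated)
communality_before = (unrotated**2).sum(axis=1)
communality_after = (rotated**2).sum(axis=1)
```

Because the rotation is orthogonal, each variable's communality is unchanged by it. This is why the communality columns of the unrotated and rotated matrices agree apart from rounding.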

Interpretation  of  the  Factors.  The  most  noticeable  change  in  the  analysis 
is  that  the  verbal-educational  factor  has  disappeared  and  its  variance  has 
been  distributed  mainly  between  the  verbal  and  reasoning  factors. 

An  interesting  result  is  that  the  school  science  examinations  have 
even  higher  factor  coefficients  for  the  reasoning  factor.  Although  STEP 
Science  also  has  a  higher  loading  on  this  factor  than  on  the  corresponding 
factor  in  the  first  analysis,  its  value  is  considerably  smaller  than  that  for 
the  school  science  examinations. 

Most  of  the  variance  of  the  Test  of  Understanding  Science  is 
attributable  to  factor  I.  In  view  of  the  high  loadings  which  the  verbal 
reference  tests  have  on  this  factor,  it  is  a  clear  verbal  factor.  This 
seems  to  support  the  interpretation  from  the  first  analysis  that  this  Test 
of  Understanding  Science  is  very  much  dependent  upon  the  verbal  ability 
of  the  subjects.  The  test  has  very  small  loadings  on  reasoning  and  its 
attitude  component  is  concentrated  in  the  factor  defined  by  the  test  of 
open-mindedness  in  the  second  analysis.  It  is  interesting  that  this 




TABLE X

UNROTATED FACTOR MATRIX FROM THE ANALYSIS OF THE
COMPLETE TESTS AND THE REFERENCE VARIABLES
FOR THE REPRESENTATIVE SAMPLE (N = 166)*

                                  Common-Factor Loadings
Variable                    I     II    III     IV      V     VI    VII   Communality

STEP Sc.                  725   -428   -106   -035    016    071    114        740
TASK                      535   -468   -208   -155    122    001    189        623
Science 9A                698   -210   -208    043    078   -309   -170        707
Science 9B                751    028   -226    088    000   -336   -153        761
Science 10                738    111   -079    161   -038   -216   -320        740
TOUS                      695   -029   -171    093   -117    290    301        711
Number I                  276    708   -158   -446    032   -062    129        822
Number II                 095    796   -171   -320    048   -199    055        819
Verbal                    710    122   -345    250    103    153    279        812
Spatial I                 383    121    770   -004   -094   -122    308        873
Spatial II                439    103    720    003   -188   -174    287        869
R. Matrices               437   -132    553    222   -050    200   -264        675
Gen. Reasoning            707    142    146   -306    174    108   -150        700
H-C. Syllogism            486    131    225    160    297    226   -242        528
SCAT Quant.               734    170    253   -190    059    031   -264        741
SCAT Verbal               771    052   -251    136    073    153    231        761
Sc. Att. I               -209   -219    156    216    604   -535    215        859
Sc. Att. II              -277    178    194   -109    658    350    018        712
Sc. Att. IIIB            -192    453   -077    519    147    026    010        540
Sex                       008   -651    040   -541    148   -002   -006        740

Sums of Squares         6.040  2.429  2.041  1.277  1.065  0.992  0.889     14.733
% of Common Variance     41.0   16.5   13.9    8.7    7.2    6.7    6.0      100.0
% of Total Variance      30.2   12.1   10.2    6.4    5.3    5.0    4.5       73.7

*Decimal points have been omitted from the factor loadings.
Any comparisons of the loadings should only be made in terms of the number
of figures which can be regarded as significant.
TABLE XI

ROTATED FACTOR MATRIX FROM THE ANALYSIS OF THE
COMPLETE TESTS AND THE REFERENCE VARIABLES
FOR THE REPRESENTATIVE SAMPLE (N = 166)*

Variable              Common-Factor Loadings           Communality
                      (listed entries, in factor order)

STEP Sc.                 66   26   20  -39                    740
TASK                     58   10   12  -48                    623
Science 9A               45   45   01  -22   46  -20          707
Science 9B               47   46   18  -16   52  -03          760
Science 10               35   62   11  -03   46   08          740
TOUS                     78   12   24                         710
Number I                 88                                   823
Number II                87   22                              819
Verbal                   85   19   17                         812
Spatial I                91                                   873
Spatial II               91                                   870
R. Matrices              05   56  -37   45   17  -02          675
Gen. Reasoning           33   62   30   21   10   26          699
H-C. Syllogism           23   63  -13                         527
SCAT Quant.              24   70   23   30   16               743
SCAT Verbal              81   26                              760
Sc. Att. I              -89                                   859
Sc. Att. II              14  -18  -79                         713
Sc. Att. IIIB            71                                   540
Sex                     -18  -09  -83                         739

Factor                      I     II    III     IV      V     VI    VII      Total

Sums of Squares          3.59   2.67   2.00   2.09   1.08   1.49   1.81     14.732
% of Common Variance     24.3   18.2   13.6   14.2    7.3   10.1   12.3      100.0
% of Total Variance      17.9   13.4   10.0   10.4    5.4    7.5    9.1       73.7

*For clarity loadings below 0.2 are omitted except in special cases.
Decimal points are omitted from the loadings.
pattern regarding attitudes is different from that of both the STEP Science
test and the school examinations.

Factors V and VI correspond to factors VI and VII in the first
analysis. Factor V throws no further light on the previous factor VI.
Factor VI in the new analysis has lost the contribution made from the
sex variable to its counterpart, and it shows higher loadings on the
school examinations. It may be that this factor is related to the X factor
which Vernon (1961) has discussed. He said that this somewhat ill-defined
factor of industriousness plus interest is often found in educational
attainments, especially when measured by school examinations.

The common factor labelled "VII" is obviously related to the
previous factor VIII, which was interpreted as a factor for superstitiousness.
In the second analysis the sex variable provides the major contribution.
The positive loading of the superstition test again indicates that
girls are more superstitious than boys. STEP Science and the Test of
Application of Scientific Knowledge now have substantial loadings on this
factor. Carey (1958) cited evidence that, on the average, boys do better
in problem-solving than girls do; thus, if the two tests could be regarded
as effective measures of problem-solving, these loadings would be explainable
in those terms. In view of the low coefficients for the reasoning
factor, another explanation seems preferable. Vernon (1960) reported
that informational test items show a considerable sex difference in favour
of males. Since STEP Science and especially the test of application involve
learned information related to new situations, it is reasonable to believe
that those who have the widest range of information available will do best
on these tests. There is a further point which should be noted here. An
examination of the Test of Application of Scientific Knowledge will show
that many of the items refer to males and their interests. It is possible
that this may have contributed to the loading of this test on male sex.
Thus, in the second analysis factor VII is interpreted primarily as a
sex factor.

Further  Analysis  of  the  STEP  Science  Test 

In order to test hypothesis 5, an analysis of STEP Science based
on groupings provided by the Teacher's Guide (1959) was made. The
Guide gave the allocations of the items to the six abilities which the
publishers claimed to measure as the main types of scientific reasoning.
The names for these and the number of items in each are shown in the
following list:

1. Ability to identify and define scientific problems.  3 items

2. Ability to suggest and screen hypotheses.  33 items

3. Ability to select valid procedures.  8 items

4. Ability to interpret data and draw conclusions.  6 items

5. Ability to evaluate critically claims or statements made by others.  4 items

6. Ability to reason quantitatively and symbolically.  6 items

One item was allocated to both 1 and 2 in the Guide; it was used
in 1 only for this analysis. An item which was allocated to 4 and 6 was
used in 4 only.

Scores were obtained for the sample of 166 students for each of
these six subtests of scientific reasoning. Using these scores, the six
variables were intercorrelated and factor analyzed. Because of the
small number of variables involved, unities were not suitable approximations
to the communalities. The squared multiple correlations were
used in this case. In order to obtain these the inverse of the correlation
matrix was calculated. Use was then made of the fact that the squared
multiple correlation is one minus the reciprocal of the corresponding
diagonal element of the inverse matrix. These values are shown in
parentheses in the diagonal cells of the correlation matrix. This matrix
and the unrotated factor matrix are shown in Table XII.
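
The relation just described, that each squared multiple correlation equals one minus the reciprocal of the corresponding diagonal element of the inverse correlation matrix, can be sketched as below. The example assumes NumPy; the function name and the 3 x 3 correlation matrix are hypothetical and are not the study's data.

```python
import numpy as np

def squared_multiple_correlations(corr):
    """Squared multiple correlation of each variable with all the others,
    computed as 1 - 1 / diag(R^-1)."""
    R = np.asarray(corr, dtype=float)
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Hypothetical correlation matrix for three variables (illustrative only).
R = np.array([
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])

smc = squared_multiple_correlations(R)

# Reduced correlation matrix: communality estimates replace the unities.
reduced = R.copy()
np.fill_diagonal(reduced, smc)
```

The `reduced` matrix corresponds to a correlation matrix with the communality estimates in its diagonal cells, which is the form in which such a matrix is submitted to the factor analysis.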

Interpretation. Three eigenvalues greater than zero were obtained from
the factor analysis. An examination of the sums of squares indicates
that the first eigenvalue accounts for more than the starting communality.
Its value of 1.571 is greater than the sum of the squared multiple correlations,
which is 1.399. In circumstances such as these Harman (1960)
has argued that only the first factor is of practical significance.

This factor accounts for only 26.2 per cent of the total variance.
This is a reflection of the low correlations between the subtests. It is
not possible to draw a firm conclusion about the subtests on the basis of
this evidence. It could be that the low correlations are due to the subtests
measuring specific abilities which are different from one another.
A more likely explanation is that the subtests have low reliabilities and
that what is involved over and above the first factor is primarily error.
This explanation suggests itself in view of the small numbers of items
in the categories.
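
The percentage quoted above follows from dividing the first eigenvalue by the number of standardized variables, each of which contributes unit variance. A minimal sketch of the arithmetic, using the figures reported in the text (the function and variable names are illustrative):

```python
def share_of_total_variance(eigenvalue, n_variables):
    """Percentage of total variance carried by one factor when each of the
    n standardized variables contributes unit variance."""
    return 100.0 * eigenvalue / n_variables

first_eigenvalue = 1.571   # first factor of the six reasoning subtests
sum_of_smcs = 1.399        # sum of the squared multiple correlations
n_subtests = 6

pct = share_of_total_variance(first_eigenvalue, n_subtests)
exceeds_starting_communality = first_eigenvalue > sum_of_smcs
print(round(pct, 1), exceeds_starting_communality)
```

The same calculation applied to the value of 2.309 reported for the six-variable analysis of the separate sciences gives 38.5 per cent.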

The  Question  of  Abilities  Specific  to  the  Different  Sciences 

A postulate basic to this investigation was that at the high school
level abilities in different sciences such as physics, chemistry, and
biology do not determine individual differences in performance. General
intellectual ability is regarded as the dominating influence. The clarity
of the factor patterns in the first two analyses suggests that this is a
sound position to take. However, a further analysis was made in an
attempt to throw more light upon this question.

The items of the STEP test were grouped on the basis of the
sciences (biology, chemistry, physics, and meteorology), and scores
for the 166 students were obtained on each group. Scores on these subtests,
along with two of the school science tests, Science 9A and Science 10,
were intercorrelated and factor analyzed. Three eigenvalues greater
than zero were obtained. The correlation matrix and the factor matrix
are set out in Table XIII.

Interpretation. Again in this analysis the first factor accounts for more
than the starting communality, the value being 2.309 as against 2.147 for
the sum of the squared multiple correlations, and therefore it alone is
of practical significance.

This factor accounts for 38.5 per cent of the total variance. It
is possible that specific abilities are operative, but in view of the size of
the factor obtained, and of the small numbers of items in some of the
groups (21, 9, 16, and 8), which probably mean that these subtests do
not have high reliabilities, it seems very likely that a general ability
applicable to all the sciences is the most important.
TABLE XIII

CORRELATION MATRIX AND UNROTATED FACTOR MATRIX FOR THE STEP SUBTESTS
FOR BIOLOGY, CHEMISTRY, PHYSICS AND METEOROLOGY TOGETHER
WITH SCIENCE 9A AND SCIENCE 10*

*Decimal points are omitted from correlation coefficients, factor loadings and communalities.
Estimates for the communalities are shown in parentheses in the correlation matrix.


CHAPTER  VII 


FINDINGS  RELATED  TO  THE  HYPOTHESES 
AND  THE  TESTS 


Chapter VI has been concerned with the details of the analysis.
In order to draw out some of the implications of this analysis, its significance
for the hypotheses and the tests will now be examined.


The  Findings  About  the  Hypotheses 


Hypothesis 1: Certain tests of problem-solving ability in science measure
ability to reason with the knowledge which they require to be
recalled.

The hypothesis was not substantiated in terms of the instruments
used in the investigation. It was found that the STEP Science test had
quite a low loading on the reasoning factor. The Test of Application of
Scientific Knowledge had an even lower loading, and this is understandable
in view of its lower reliability.


Hypothesis  2:  Certain  tests  of  scientific  attitude  measure  different 
attitudes  or  one  attitude,  but  not  abilities. 


The  tests  of  scientific  attitude  were  found  to  be  related  to  abilities 
to  a  rather  limited  extent  as  was  predicted.  Each  test  had  a  high  loading 
on  a  separate  attitude  factor  and  thus  the  hypothesis  was  upheld. 


Hypothesis 3: Examinations in science used by the Alberta Department
of Education and Edmonton schools at the grade IX and grade X levels
respectively, measure knowledge primarily. They measure problem-solving
ability to a small extent and do not measure scientific attitudes.


The analysis showed little differentiation between Sections A and
B of the Departmental examination. In each case moderately sized factor
coefficients were obtained on both verbal and reasoning factors. It was
predicted that the two sections of the examination would correlate highly
because both were measuring knowledge mainly. A high correlation
coefficient (0.668) was obtained, but this was due to the contributions
from both verbal and reasoning factors. The reasoning factor also played
quite a prominent role in the Science X examinations, and in this case the
loading was high (0.62). The school examinations showed low or negligible
loadings on two of the attitude scales, but intermediate values were obtained
on the factor defined by the scale designed to measure attitude to investigation
and discovery of knowledge. This was interpreted as meaning that
the attitude concerned is important to success in school science, rather
than that the examinations measure the attitude in any predictable way.
Thus, the hypothesis was only partially substantiated. It was concluded
that the Departmental and local school examinations do not measure
scientific attitudes. However, it was also concluded that, in so far as
the basic postulate of the significance of reasoning ability is sound,
Departmental and local school examinations measure problem-solving
ability as well as the acquisition of knowledge.
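
In an orthogonal factor solution, the correlation between two variables that is reproduced by the retained factors is the sum of the products of their loadings, and this sum is unaffected by orthogonal rotation. The sketch below, assuming NumPy, applies that relation to the unrotated loadings listed for Science 9A and Science 9B in Table X. The variable names are illustrative. Comparison with the reported coefficient of 0.668 is indicative only, since it is an assumption made here, not a statement in the text, that the coefficient was computed on the same sample.

```python
import numpy as np

# Unrotated loadings on the seven factors, taken from Table X
# (decimal points restored).
science_9a = np.array([0.698, -0.210, -0.208, 0.043, 0.078, -0.309, -0.170])
science_9b = np.array([0.751,  0.028, -0.226, 0.088, 0.000, -0.336, -0.153])

# Contribution of each factor to the reproduced correlation.
contributions = science_9a * science_9b
reproduced_r = contributions.sum()
print(np.round(contributions, 3), round(reproduced_r, 3))
```

Listing the individual products shows how much each factor contributes to the agreement between the two sections, which is the kind of breakdown on which the argument above depends.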

Hypothesis  4:  At  the  grade  X  level,  there  is  no  unique  problem-solving 
ability  measured  by  the  tests  in  this  battery  and  selected  as  measures 
of  the  problem-solving  objective.  The  tests  involve  mainly  verbal  and 
reasoning  factors. 

There was no evidence from this analysis that the STEP Science
test and the Test of Application of Scientific Knowledge measure a
unique problem-solving ability. Therefore, the hypothesis was substantiated,
except in so far as the tests were not good measures of reasoning
ability.


Hypothesis 5: Whilst the STEP Science test measures a broad reasoning
ability, it does not measure reliably "ability to identify and define
scientific problems," "ability to suggest and screen hypotheses," "ability
to select valid procedures," "ability to interpret data and draw conclusions,"
"ability to evaluate critically claims or statements made by
others," and "ability to reason quantitatively and symbolically."


As  was  predicted  the  analysis  of  the  scientific  reasoning  subtests 
produced a general factor which accounted for most of the common variance.
However, this factor did not account for sufficient of the total
variance  to  support  a  firm  conclusion  that  the  scientific  reasoning 
abilities  are  not  measured.  Nevertheless,  the  small  numbers  of  items 
in  most  of  the  categories  would  suggest  that  the  subtests  do  not  have  high 
reliability,  and  it  seems  likely  that  if  the  subtests  were  longer  a  general 
factor  would  still  be  the  most  important. 

Hypothesis  6:  Success  in  science  is  partly  dependent  upon  a  special 
scientific  ability. 


No factor emerged from the analysis having intermediate or high
coefficients on the science achievement tests but negligible or low loadings
on other factors. Therefore, there was no evidence in this investigation
of a special scientific ability, and the hypothesis was not upheld.

Hypothesis  7:  Tests  of  understanding  science  measure  knowledge, 
ability  to  reason  with  that  knowledge,  and  attitudes. 

The Test of Understanding Science was found to have a high verbal
loading as was predicted. The loadings on reasoning and attitude factors
were low. Thus the hypothesis was only partially substantiated. It
seems that the test used measures mainly knowledge and verbal ability,
and involves attitudes to a limited extent.

Hypothesis 8: Tests of scientific attitude components overlap sufficiently
to produce a common hypothetical variable which can be called "the
scientific attitude."

The  analysis  showed  that  there  is  not  sufficient  overlap  between 
the  three  attitude  measures  used  in  this  investigation  to  define  a  common 
factor which might be called "the scientific attitude." The evidence would
indicate  that  the  hypothesis  is  not  to  be  upheld.  However,  in  view  of  the 
fact  that  the  superstition  test  is  so  strongly  related  to  the  sex  variable, 
investigation  of  more  measures  of  the  suggested  components  of  scientific 
attitude  would  need  to  be  carried  out  to  draw  a  firm  conclusion. 

The  Findings  About  the  Tests 

The  Test  of  Knowledge.  The  test  selected  primarily  as  a  measure  of 
knowledge,  Section  A  of  the  Grade  IX  Departmental  Science  examination, 
was  found  to  have  coefficients  for  the  verbal  and  reasoning  factors  almost 
identical  with  Section  B  of  the  same  examination.  The  coefficient  on  the 
reasoning factor was intermediate in size (0.45). The evidence of past
research is that a test of simple recall would load mainly on the verbal
factor, as was postulated in Chapter III. It has been mentioned that of the
forty-eight items in Section A, eleven required some application, though
this was of a very simple kind. The most reasonable explanation of the
presence of the reasoning factor in Science 9A is that the eleven items
requiring some application were the items which provided most of the
discrimination between the students. Unfortunately, the item scores
were not available for this suggestion to be checked. If it is true, it has
interesting implications in view of the simple type of application involved.




The Tests of Problem-Solving. The factor pattern indicated that the
constructed test measures much the same abilities as the STEP Science
test upon which it was modelled. It had loadings on the same factors,
and the differences in the sizes of these coefficients can be attributed
to the lower reliability of the constructed test. Therefore it would
appear that if the STEP Science test were measuring a unique
problem-solving ability, a factor corresponding to this should have been
identified.

The most interesting finding about these tests was their high loadings
on the verbal factor and their unexpectedly low loadings on the
reasoning factor. Evidently the tests place so much emphasis upon the
verbal processes involved in comprehending the meaning of the stems
and responses, and in comparing words in the responses one with another,
that the person does not become very involved in manipulating the ideas
conveyed. The following is a typical item from the STEP test, and an
inspection of it would suggest that the foregoing explanation is not
unreasonable:


PART  TWO 
Questions  1-5 

Your  family  owns  a  cattle  ranch.  One  of  the  wells  no  longer 
pumps  water,  and  you  go  out  with  the  hired  man  to  help  him  fix 
it. 

1. The electrically driven lift pump was designed to draw water
up by "suction." If you found the pump motor running but no
water being drawn up, you might reasonably have suspected
any of the following EXCEPT

A.  a  leak  in  the  suction  pipe  under  the  pump 

B.  an  unusually  high  barometric  pressure 

C.  low  water  in  the  well  at  a  level  more  than  34  feet  below 
the  pump 

D.  a  broken  drive  shaft  between  the  motor  and  the  pump. 
(1957, p. 6)

It is interesting that the syllogism test had a high reasoning loading since
it  is  a  test  expressed  only  in  verbal  form.  The  statements  in  that  test 
are  short  and  use  very  simple  language  and  apparently  allow  greatest 
emphasis  to  be  placed  upon  the  relationships  between  the  ideas  conveyed. 

It  seems  justifiable  to  conclude  that  the  STEP  Science  test  is  not  a  good 
measure  of  problem-solving  ability  in  regard  to  the  reasoning  aspect 
of  that  objective. 

A point that needed some clarification was whether or not the
content of the STEP test is such that it is less suitable for students in
Edmonton than for students in the United States. Its reliability was
estimated by the Kuder-Richardson Formula 20, and a value of 0.77 was
obtained. This is to be compared with the value of 0.81 given in the
Technical Report (1957), also determined by the Kuder-Richardson
Formula 20. The correlations with the SCAT total score were also
compared. The reported value was 0.65 and the value obtained in the
present study was 0.57. For both reliability and correlation with
SCAT total score the values obtained here are slightly lower, but it is
unlikely that the differences are so great as to mean that the STEP test
would have an appreciably higher loading on reasoning if a similar
investigation were carried out in the United States.
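
The Kuder-Richardson Formula 20 referred to above can be sketched as
follows. The item scores shown are invented toy data and not scores from
this investigation; the function name is hypothetical, and the population
form of the variance of total scores is assumed.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomously scored items.

    `responses` is a list of rows, one per student, each row a list of
    item scores (1 = correct, 0 = incorrect).
    KR-20 = (k / (k - 1)) * (1 - sum(p_i * q_i) / variance of total scores)
    """
    n_students = len(responses)
    n_items = len(responses[0])

    # Variance of the students' total scores (population form assumed).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    # Sum over items of p * q, where p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_students
        pq_sum += p * (1.0 - p)

    return (n_items / (n_items - 1)) * (1.0 - pq_sum / var_total)


# Invented toy data: four students, three items.
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(toy_scores))
```

For a test of realistic length the same calculation would be applied to the
full matrix of item scores.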

The  Test  of  Understanding  Science.  A  large  coefficient  on  the  verbal 
factor  was  obtained  for  this  test.  This  was  so  pronounced  that  it  seemed 
to  be  responsible  for  defining  a  separate  verbal  factor  in  the  first  analysis. 
It  is  interesting  to  note  that  in  the  first  analysis  Scales  I  and  II  loaded 
most  heavily  on  factor  II,  and  that  Scale  III  loaded  most  heavily  on  factor 
I, which is an educational factor. A possible interpretation of this is
that school courses give information about the procedures of science
(Scale III) but not much about scientists (Scale II) or about the role of
scientific societies and the scientific enterprise generally (Scale I).
There  was  evidence  of  small  attitude  components  in  the  test.  These 
suggested  that  the  person  who  is  open-minded  and  not  superstitious  tends 
to  score  slightly  higher.  There  was  also  a  suggestion  that  the  person 
who  is  least  superstitious  tends  to  have  a  more  favorable  attitude 
towards  scientists.  The  reasoning  factor  accounted  for  a  very  small 
amount  of  the  variance  for  the  test.  In  view  of  this,  the  question  is 
raised  as  to  whether  it  would  be  possible  to  achieve  the  same  purpose 
as the test achieves with somewhat simpler items. Items 31 to 37, for
example, seem to be unnecessarily complex. If items with simpler stems
and responses could be constructed to cover the same subject matter,
then the reliability would probably be raised from the present value of
0.76.

Some of the students in the project found the test very interesting,
and measures of these wider understandings should be very useful
to teachers.

The  Attitude  Scales.  The  three  attitude  scales  measured  something 
different  from  one  another  and  different  from  the  abilities  sampled  in 
this  analysis.  This  provides  some  evidence  for  their  validities.  It  is 
unlikely  that  the  differences  are  just  due  to  differences  in  the  types  of 
items and the methods of response. The response procedure of the
dogmatism scale is similar to the Likert type scale, and Edwards (1957)
cited evidence that there is high correlation between attitude statements
scaled by this method and by the equal-appearing interval method used
in Scale II.

The  loading  of  the  Test  of  Understanding  Science  on  the  factor 
defined  by  the  dogmatism  scale  was  interpreted  as  meaning  that  the 
open-minded  person  has  a  better  understanding  of  science  as  a  human 
activity  and  of  scientists. 

It is suggested that the analysis lends support to the claim that
the dogmatism scale measures the open and closed mind dimension, as it
purports to do.

The evidence from this investigation suggests that the Maller and
Lundeen Scale probably measures superstitiousness, and that this is very
much related to the sex variable, girls being more superstitious than
boys. In view of this it is suggested that the test is not an ideal measure
of the belief in cause and effect relationships, though no better test seems
to be available.

Some  light  was  thrown  upon  the  Scale  used  to  measure  attitude  to 
investigation  and  discovery  of  knowledge.  The  analysis  indicated  that  the 
attitude  involved  is  related  to  success  in  the  school  science  examinations. 
It  may  be  that  an  attitude  to  school  is  involved.  The  scale  could  be 
related  to  the  industriousness  and  interest  factor  which  Vernon  (1961)  has 
discussed. Another interesting possibility arises as a result of the
evidence cited by Carey (1958) suggesting that a major reason why boys
tend to perform better than girls in problem-solving tasks is that boys
have a more favorable attitude to problem-solving. From the second
factor analysis it is seen that the factor defined by the scale under
consideration had higher loadings on the school examinations than on STEP
tests was similar to that between the loadings of the corresponding tests
on the reasoning factor. It may be that there would be overlap between
this scale and Carey's scale to measure attitude to problem-solving.

It  will  be  recalled  that  in  the  case  of  three  classes  of  students  in 
the  testing  project  the  scale  clearly  separated  students  rated  high  and 
low  by  their  teachers  for  curiosity.  It  did  not  differentiate  in  this  way 
for  students  not  in  the  project.  Thus  the  evidence  is  not  clear-cut  but 
there  is  at  least  a  possibility  that  the  scale  is  measuring  curiosity  or 
something  related  to  curiosity. 

The  evidence  indicates  that  further  investigation  needs  to  be  made 
to  clarify  what  this  scale  is  measuring,  and  it  suggests  that  such  inquiry 
would  be  worth  while. 

The School Examinations. In view of criticism of school examinations,
such as that cited at the beginning of this thesis, one of the most
interesting findings of this investigation was the extent to which the
school examinations loaded on the reasoning factor, especially at the
grade X level. It appears that they measure both knowledge and the
reasoning aspect of problem-solving ability to a satisfactory extent.
Black (1960) found that Science XII Departmental examination scores in
Alberta were better than other Departmental scores, and better than
various standardized tests, as predictors of freshman success at the
University of Alberta. The Science XII score used was an average of the
Grade XII Science examinations written, for example, Chemistry XXX plus
Biology XXXII and/or Physics XXX. It may well be that the extent to which
these examinations sample reasoning ability is responsible for their
usefulness as predictors of university success.

An inspection of the school examinations shows that they do not
contain  items  which  clearly  measure  the  kind  of  content  involved  in  the 
Test  of  Understanding  Science.  There  is  also  no  evidence  of  items 
which  would  give  measures  of  scientific  attitudes.  The  fact  that  the 
tests  were  found  to  load  on  one  of  the  attitude  factors  is  best  interpreted 
as  indicating  that  this  attitude  contributes  to  success  in  school  science. 


CHAPTER  VIII 


SOME  IMPLICATIONS  OF  THE  INVESTIGATION 

It was noted at the beginning of this thesis that Watson and
Cooley (1960) claimed that in the area of evaluation, one of the pressing
needs was for investigations which would help to produce an
understanding of the intellectual processes involved in different kinds of
items, and of the relationships of these processes to objectives. The
major implication of the present study is to emphasize this problem,
but at the same time to give some leads as to kinds of investigations
which are likely to be helpful.

The  evidence  has  indicated  that  the  STEP  Science  test  items 
measure  verbal  ability  and  knowledge  mainly,  and  reasoning  to  only  a 
small  degree.  This  kind  of  item  is  similar  in  its  emphasis  upon  verbal 
content  to  many  other  items  that  are  suggested  as  suitable  measures  of 
scientific  thinking  or  critical  thinking,  for  example  those  by  Burnett 
(1957)  and  by  Nelson  (1958).  The  question  is  raised  as  to  whether  the 
items  suggested  by  these  and  other  authors  would  prove  any  more 
successful  than  those  of  the  STEP  test  when  they  place  so  much  weight 
on verbal ability. Two queries arise which need consideration. The
first concerns the importance of verbal and reading ability in the study
of science. The second relates to the kind of item which is best
calculated to measure reasoning ability.

It is not suggested here that verbal ability is not important to
problem-solving in science. It is undoubtedly important. Roe (1956)
found, in an investigation of the abilities of various groups of scientists,
that they were all high in verbal ability and the theoretical physicists
were extremely high. What is claimed here is that if reasoning ability
is important to the problem-solving objective, then items are needed
which will measure it and not be just measures of verbal ability.
Further, if special training is needed to help students read scientific
literature efficiently, as Hurd (1960), for example, implies by including
this as one of the objectives of science teaching which he lists, then
achievement towards this goal ought to be evaluated as such, and ought
not to dominate a test of reasoning.

The  fact  that  the  STEP  items  had  a  small  loading  on  the  reasoning 
factor,  but  that  the  Science  9A  test  had  a  considerably  bigger  loading 
suggests  that  items  requiring  simple  application  can  be  good  measures 
of  reasoning.  The  Science  9B  test  involves  items  of  various  kinds 
including  an  essay,  matching  type  items,  problems  of  a  mathematical 
nature,  items  requiring  a  physical  explanation  for  a  phenomenon,  and 
items  involving  interpretation  of  diagrams.  Some  use  of  diagrams  is 
made  in  the  Science  9A  test  also.  It  seems  that  the  use  of  diagrams 
upon  which  questions  are  based  is  one  useful  method  of  minimizing  the 
verbal  content. 

It is suggested that an investigation needs to be carried out involving
tests made up of items of a carefully graded kind from simple factual
recall to items involving simple application and others requiring
increasingly difficult application. Some tests could involve items of a
verbal kind and others could be made largely dependent upon diagrams.
Such a set of tests could be based upon the different classifications set
out in Bloom's Taxonomy of Educational Objectives (1956). Care should be
taken to see that the tests at the different levels of difficulty and
complexity are reliable. The investigations reported in this thesis
indicate that a factor analytic study of such tests with suitable reference
variables should throw light on what different kinds of items measure, and
would be a valuable contribution to test construction.

Another aspect of evaluation of the problem-solving objective which
needs attention is the measurement of creativity in science. Though
educators regard this as important, there has been little investigation
of measuring instruments to assess creativity in school science. The
recent publications of Getzels and Jackson (1962) and of Torrance (1962)
indicate that tests with quite high reliabilities have been developed for
evaluating aspects of creativity. Some attempts might well be made to
explore the possibility of adapting some of these ideas to testing in the
field of science.

Science educators commonly regard the development of scientific
attitudes as an important objective of science teaching. However, at
present there is a lack of instruments to provide useful assessments in
regard to this objective. It seems that there is a small group of
attitudes which could be classified as scientific attitudes, among which
are open-mindedness, curiosity and belief in cause and effect
relationships. The Dogmatism Scale produced by Rokeach shows prospects of
being useful in the school situation. It has been suggested in this thesis
that curiosity is a desirable attitude for all citizens. It seems to be
especially important for the scientist, and from the point of view of
helping to develop future scientists, curiosity is probably one of the most
important attitudes to be fostered. An instrument which would validly
measure curiosity would be likely to be useful as a predictor, along with
other measures, of future success in science. An attempt has been made in
the present study to produce a scale related to curiosity. It may suggest
some ideas which would be helpful. The present investigation has offered
some evidence as to what it is measuring but that is as much as can be
said. It seems that a major project in this area is needed. The work of
Berlyne is the most encouraging sign of such investigation.

The question of whether or not the evaluation of such attitudes
should be used in assessing students' grades is rather controversial
since such evaluation is heavily dependent upon value judgments. The
National Science Teachers' Association, in the report to members
attending the tenth annual convention (1962), took the view that the
development of scientific attitudes is an important objective, but that
their evaluation should not be used in determining grades. At the present
time this is the most satisfactory position to hold in view of the
uncertainty about the measuring techniques. However, the present
investigation has indicated that a test of understanding science involves
some of the scientific attitudes. It is suggested that testing of this
kind, based largely on factual knowledge, could be implemented more
extensively in schools. This should assist in the development of more
favorable attitudes to science and the scientific enterprise, and is
likely to be of value in promoting some of the other attitudes like
open-mindedness and belief in cause and effect relationships.

TEST OF APPLICATION OF SCIENTIFIC KNOWLEDGE

This  is  a  test  of  your  ability  to  apply  the  scientific  knowledge 
you  have  learned. 

Each of the questions or incomplete statements in this test is
followed by four suggested answers. You are to decide which one of
these you think is the best answer. Then place a circle around the
letter belonging to this answer.* Mark only one answer for each question.

Sample 

Many thermostats for controlling heating devices depend on the
bending of a bimetallic strip as it is heated. The best explanation
of this bending is

A the bimetallic strip gets soft as it is heated
B one of the metals in the strip expands forcing the other to bend
*C both of the metals in the strip expand but one expands more than
the other
D one of the metals conducts the heat away better than the other

Since  C  is  the  best  answer  it  is  the  one  chosen. 

You  will  make  your  best  score  by  answering  every  question. 

Work  carefully,  but  do  not  spend  too  much  time  on  any  one  item.  If  a 
question  seems  too  difficult  make  the  most  thoughtful  selection  you  can, 
and  if  you  finish  before  time  is  called,  go  back  and  spend  more  time  on 
such  questions. 

*In this copy the correct answer for each item has been indicated
by an asterisk.

Questions 1 to 5

1. Jack was enjoying a holiday at a scout camp. One night as they
looked at the bright starry sky, the scoutmaster recalled that the
stars were said to be like our sun. He explained that the reason we
get more heat from the sun than the stars is that

A  the  gas  in  the  stars  only  burns  occasionally  as  they  twinkle 
*  B  the  earth  is  closer  to  the  sun  than  to  the  stars 
C  the  stars  are  much  smaller  than  the  sun 

D  the  sun  is  a  better  radiator  than  the  stars  are,  because  it  is  a 
duller  colour. 

2.  One  day  they  were  discussing  the  functioning  of  the  human  body. 
When  one  boy  asked  how  the  body  maintained  a  consistent  heat  level, 
the  scoutmaster  explained  that  cooling  was  carried  out  mainly  by 

*A  evaporation  of  perspiration  at  the  body  surface 
B  the  effect  of  breezes  on  perspiration  at  the  body  surface 
C  conduction,  convection,  and  radiation  at  the  body  surface 
D  absorption  of  heat  from  the  skin  by  the  surrounding  air. 

3. One of the boys related how the doctor had tested his hearing when
he had a blockage in one of his ears. The doctor had placed a
vibrating tuning fork on the middle of the top of his head and asked
him if he could hear it and if it sounded more in one ear than the
other. He said the sound was the same in each ear. The best
explanation of this is

A the vibrations loosened the blockage and allowed him to hear
through each ear
B the vibrations were transmitted more quickly through the bones
than through the air
*C the vibrations were conducted by the bones to the inner hearing
mechanism in each ear
D the vibrations were transmitted more slowly through the bones
than through the air.

4. Jack explained the functioning of the vocal cords which are
membranes stretched across the voice box. Which one of the following
explanations is correct?

*A a change in voice pitch can be effected by stretching or relaxing
the vocal cords by means of muscles
B women's voices are generally more highly pitched than men's
voices because their vocal cords are longer
C a man's voice is deeper than a boy's voice because the vocal
cords stretch as the boy grows to be a man
D a person makes his voice louder by causing the muscles to
tighten the vocal cords.

5. One day during a hike all the boys were sitting around having lunch
when a boy startled his mates with the sound produced by blowing
across the top of his water bottle. Which one of the following
statements about this is correct?

A the frequency of the vibrations would be higher if the boy made
the sound after drinking more of the liquid
B the wave length of the sound wave equals the distance from the
top of the bottle to the liquid
*C the pitch of the sound would be lower if he made it after drinking
more of the liquid
D when the sound is made the standing wave in the bottle has an
antinode at each end.

Questions 6 to 12

Imagine  you  have  become  interested  in  space  travel  as  a  result 
of  all  the  news  items  and  magazine  articles  about  it. 

6.  After  reading  a  book  on  plans  for  reaching  and  exploring  the  moon, 
you  recall  with  some  amusement  how  comics  used  to  portray 
spacemen  as  just  equipped  with  masks  and  oxygen  supplies.  It  is 
now  realized  that  real  spacemen  who  might  land  on  the  moon  will 
need  to  be  completely  enclosed  in  an  airtight  spacesuit.  Which  of 
the  following  pairs  of  reasons  best  accounts  for  this  fact? 

A  lack  of  oxygen  and  rapid  loss  of  perspiration 

*  B  great  heat  and  low  air  pressure 
C  cosmic  rays  and  great  heat 

D  severe  winds  and  rapid  loss  of  perspiration 

7. When the spacecraft carrying the first man to set foot on the moon
prepares to land, it will need to be slowed down gradually; which
of the following techniques is most likely to be used?

A firing rockets perpendicular to its path
*B firing rockets towards the moon
C extending large wings from its sides
D releasing a large parachute behind it.

8. There will be many problems to be overcome in equipping men to
explore the moon. Which is the most important of the following
problems to be considered by the planners?

A providing means to keep their bodies from rising off the moon's
surface
B protecting their equipment from the effects of wind and rain
C providing equipment to enable them to talk to one another
*D protecting their equipment from the effects of heat and cold.

9.  Further  problems  are  involved  in  the  return  of  the  spacecraft  to 
earth.  Which  of  the  following  do  you  think  is  the  most  difficult  to 
deal  with? 

*A  preventing  the  vehicle  from  burning  up  as  it  passes  through  the 
earth's  atmosphere 

B  slowing  the  vehicle  down  as  it  prepares  to  land 
C  preventing  the  vehicle  from  spinning  as  it  travels 
D  preventing  the  vehicle  from  being  deflected  by  currents  in  the 
atmosphere. 

10.  In  reading  about  the  design  of  spacecraft,  you  find  that  the  actual 
spacecraft  is  enclosed  in  the  launch  vehicle  for  its  journey  up 
through  the  atmosphere.  Keeping  this  in  mind,  which  one  of  the 
following  would  be  LEAST  important  if  the  return  of  the  craft  to 
earth  is  not  required? 

A colour
B weight
C construction material
*D shape

11. You read some interesting details of ways in which the air in the
upper atmosphere is examined, e.g. Aerobee rockets have carried
sealed, evacuated, steel bottles to a height of 50 miles where
automatic instruments opened them for 5 seconds, and they were
then parachuted to earth. The bottle capacity was 10 quarts. It
was found that when the air taken in by the bottles was subjected to
a pressure of 1 atmosphere, it only occupied a volume of 1/10,000
quart. Which of the following gives the closest approximation to
the pressure at 50 miles up?

A 1/10 atmosphere
B 1/1,000 atmosphere
C 1/10,000 atmosphere
*D 1/100,000 atmosphere
12. In thinking about the possibility of travelling beyond the moon to
the other planets you read some details about temperatures on
various planets. Those which are further away than Mars have
very low temperatures. All of the following factors influence the
average daily temperature of a given spot on a planet EXCEPT

A the rate of inflow of solar energy from the sun
B the rate of outflow of heat from the centre of the planet
*C the longitude of the point on the planet's surface
D the presence or absence of an atmosphere around the planet

Questions  13  to  15 

13. While visiting an industrial research laboratory Bill was watching
a chemist measuring impurities in a chemical. The chemist
illustrated the method he used with the following chart, which shows
the effect on the temperature of a sodium hydroxide solution when
hydrochloric acid was added to it at a slow constant rate. Which of
Bill's attempted explanations of the chart is correct?


*A the chemist had begun to add the acid at point B
B the last of the sodium hydroxide had just reacted at point D
C the last of the acid was added at point C
D the reaction is an endothermic one.


14. The chemist showed Bill the equipment to deal with some of the
problems involved in avoiding loss of the heat produced in the
reaction. Of the following, which would be the main problem to be
dealt with?

A  to  avoid  loss  of  heat  due  to  evaporation  of  the  solution 
*B  to  avoid  heat  transfer  to  the  containing  vessel 
C  to  prevent  the  solution  from  boiling 
D  to  prevent  heat  loss  to  the  air  by  convection 

15. Bill noticed some equations which the chemist had been writing.
He decided correctly that all the following might represent reaction
between the acid and the sodium hydroxide EXCEPT

A $\mathrm{NaOH + HCl \rightarrow NaCl + H_2O}$
*B $\mathrm{2Na + 2HCl \rightarrow 2NaCl + H_2\uparrow}$
C $\mathrm{OH^- + H^+ \rightarrow H_2O}$
D $\mathrm{OH^- + H_3O^+ \rightarrow 2H_2O}$

Questions  16  to  22 

Bob  had  been  reading  a  scientific  publication  and  discussed  with 
you  some  of  the  new  developments  mentioned  in  it. 

16.  He  read  that  experiments  are  being  carried  out  by  the  University  of 
Alaska  in  which  lampblack  is  spread  on  the  snow.  Which  is  the  best 
of  the  following  reasons  suggested  by  Bob? 

A  to  protect  those  areas  of  the  snow  from  the  sun's  heat 
B  to  lower  the  melting  point  of  the  snow 

C  to  maintain  areas  of  snow  through  the  summer  for  cooling  purposes 
*D  to  cause  those  areas  of  the  snow  to  absorb  more  heat 

17.  One  of  the  advertisements  in  the  magazine  which  caught  Bob's 
attention was for "light guides." A light guide is a tube made up
of a large number of long fibres and when light is shone on one end
of it, the light is carried through with little loss to the other end,
even when the guide is bent. Thus it can be used to transmit light
to inaccessible places. Which of the following is the most likely
explanation  of  its  operation? 

A the material of the guide is not dependent on the laws of reflection
and refraction
*B the light is reflected internally many times in the guide until it
finally emerges

C  because  light  travels  so  much  faster  in  the  guide  than  in  air  it  is 
able  to  travel  through  with  little  loss 
D  refraction  of  the  light  occurs  at  each  bend  in  the  guide. 

18.  The  magazine  discussed  some  of  the  ways  in  which  the  sun's  heat  is 
being  harnessed,  such  as  by  solar  batteries,  cookers  and  furnaces, 
e.g. at a dozen or so places solar furnaces consisting of large
mirrors  are  being  used  to  concentrate  the  sun's  rays  so  well  that 
metals  are  melted  by  them.  In  thinking  about  this  you  realize  that 
there  would  be  certain  problems  in  building  a  mirror  for  this 
purpose.  Which  of  the  following  would  be  the  most  important? 

A  to  prevent  the  mirror  itself  melting 

B  to  avoid  buckling  of  the  mirror  due  to  uneven  expansion 
C  to  remove  the  air  between  the  mirror  and  the  target  to  prevent 
heat  loss 

*D to focus the sun's rays on a small spot in order to get the maximum
heat.

19.  One  of  the  articles  in  the  magazine  which  interested  Bob  was  on 
monomolecular films, i.e. layers one molecule thick. He read
that many physical, chemical and biological processes take place
at the surfaces of substances and that monomolecular films are
often important in these. Keeping in mind that a surface is generally
a boundary between two different substances, you realize that
surface processes could occur for each of the following combinations
EXCEPT
A  solid  and  gas 
B  gas  and  liquid 
C  liquid  and  liquid 
*D  gas  and  gas 

20.  Bob  asked  if  you  could  remember  from  your  science  course  how 
magnesium  ribbon  placed  in  a  bunsen  flame  burned  with  a  brilliant 
white  flame,  producing  a  white  ash.  He  said  that  this  white  material 
is  used  as  an  insulating  material.  This  caused  you  to  consider  what 
some  of  its  properties  would  probably  be.  All  of  the  following  would 
be  correct  EXCEPT 

A  it  would  be  a  poor  conductor  of  heat 

B  it  would  be  light  because  magnesium  is  a  less  dense  metal  than 
aluminium 

C  it  would  not  melt  or  burn  easily 
*D  it  would  not  be  attacked  if  splashed  by  acids. 

21.  One  of  the  articles  also  discussed  various  applications  and  phenomena 
relating  to  sound.  Bob  said  that  it  set  him  thinking  and  one  thing  that 
he  wondered  about  was  the  explanation  for  the  crack  of  a  whip.  Of 
the  following,  the  most  likely  explanation  is 

*A when a whip is "cracked" the tip breaks the sound barrier
B the tip vibrates at a high frequency for a short time
C stationary or standing waves are set up in the whip due to reflection
of the waves at the free end
D the tip strikes another part of the whip very hard

22.  Bob  had  heard  of  Freon  as  a  refrigerant  but  now  he  discovered 
that  it  was  really  a  trade  name  for  a  group  of  substances  called 
fluorocarbons  used  for  various  purposes.  He  discovered  the 
following  details  about  them: 


Name             Chemical Formula    Boiling Point (deg. F)

"Freon-115"      CF₃CF₂Cl            -37.7
"Freon-116"      CF₃CF₃              -108.8
"Freon-C 318"    C₄F₈                21.1

Using  these  figures  only,  which  is  the  most  acceptable  conclusion 
he  could  draw? 

A "Freon-115" is a mixture of carbon (C), fluorine (F), and
chlorine (Cl) whereas "Freon-116" is a mixture of carbon and
fluorine only

B  as  temperatures  are  lowered  "Freon-C  318"  will  become  a 
liquid  before  "Freon-116" 

C  the  fewer  kinds  of  atoms  in  a  fluorocarbon  molecule  the  higher 
its  boiling  point 

D  the  greater  the  number  of  fluorine  atoms  in  the  fluorocarbon 
molecule  the  higher  the  boiling  point. 

Questions  23  to  25 

You  live  in  a  town  in  an  area  which  has  a  low  rainfall  but  it  is 
near  the  sea.  Industrial  development  and  population  growth  have  caused 
a  water-supply  problem  and  you  are  on  a  committee  to  investigate  the 
possibility of obtaining salt-free water from sea-water.

23. As a starting point you looked up the solubility of sodium chloride
and found the following data:

Solubility of sodium chloride (in grams) in 100 grams of water at various
temperatures

0°C          20°C         60°C         100°C
35.7 gms.    36.0 gms.    37.5 gms.    39.8 gms.

You  also  found  that  sea-water  contains  about  3  grams  of  sodium 
chloride  per  100  grams  of  salt-water. 

On  the  basis  of  these  facts  you  concluded  that 

A  the  salt  would  all  separate  out  if  the  salt  water  were  cooled  to 
ice  temperature 

B  boiling  the  salt  water  would  be  a  help  because  the  solubility  of 
salt  decreases  with  increasing  temperature 
*C  no  salt  would  separate  out  even  if  the  salt  water  were  cooled  to 
ice  temperature 

D  some  but  not  all  the  salt  would  separate  out  if  the  salt  water  were 
cooled  to  ice  temperature. 
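Note: the reasoning behind question 23 can be checked with a little arithmetic. The sketch below is only an illustration, not part of the original test. It assumes that "3 grams of sodium chloride per 100 grams of salt-water" means 3 grams of salt in 97 grams of water, and it ignores any freezing of the water itself. The function names are hypothetical.

```python
# Solubility of NaCl in grams per 100 g of water, copied from the table
# in question 23 (keys are temperatures in degrees Celsius).
SOLUBILITY = {0: 35.7, 20: 36.0, 60: 37.5, 100: 39.8}


def salt_per_100g_water(salt_g, solution_g=100.0):
    """Convert grams of salt per `solution_g` of solution into
    grams of salt per 100 g of water (assumed interpretation)."""
    water_g = solution_g - salt_g
    return 100.0 * salt_g / water_g


def salt_separates(concentration, temp_c):
    """True if the concentration exceeds the tabulated solubility."""
    return concentration > SOLUBILITY[temp_c]


sea_water = salt_per_100g_water(3.0)
print(round(sea_water, 2))           # about 3.09 g per 100 g of water
print(salt_separates(sea_water, 0))  # False: far below 35.7 g at 0 deg C

# Solubility rises with temperature, so boiling does not force salt out.
values = [SOLUBILITY[t] for t in sorted(SOLUBILITY)]
print(values == sorted(values))      # True
```

Since the sea-water concentration is roughly a tenth of the solubility at 0°C, the table alone gives no reason to expect salt to separate on cooling.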

24.  A  member  of  the  committee  obtained  some  information  about  a  new 
method  being  developed  by  the  Koppers  Company  Incorporated.  The 
following  is  a  summary  of  the  process: 

". . . sea water is cooled to 35 degrees F and mixed in a reaction
vessel with propane, a volatile organic liquid. Hydrate crystals
form with about 17 molecules of water to each molecule of propane,
and heat is given off." Excess propane vaporizes. "After being
washed free from salt the crystals pass to a decomposing chamber.
The vaporized propane is also pumped into this chamber under
pressure." The propane condenses and the hydrate crystals melt.
Two liquid layers form, propane on top of the water; the water is
separated and the propane sent back to the reaction vessel.

You  discussed  this  information  with  him.  Of  the  following 
conclusions  stated  in  the  discussion  which  is  correct? 

A  propane  dissolves  readily  in  salt  water  but  not  in  ordinary  water 
B  a  mixture  of  salt-water  and  propane  freezes  at  35°F. 

*C  propane  vaporizes  more  readily  than  salt  water 
D salt is more soluble in propane than in water.

25.  You  attempted  to  explain  some  aspects  of  the  process.  All  the 
following  explanations  are  correct  EXCEPT 

A  latent  heat  of  vaporization  is  released  when  propane  vapor 
condenses 

B  the  latent  heat  of  fusion  of  the  hydrate  crystals  is  supplied  by 
heat  released  when  the  hydrate  crystals  form 
C  latent  heat  of  vaporization  of  propane  is  supplied  by  heat  given 
off  when  the  hydrate  crystals  form 
*D the latent heat of fusion of the propane when given up melts the
hydrate crystals.

SCIENTIFIC ATTITUDE SCALE II

The  following  is  a  list  of  statements  expressing  opinions  commonly 
held  by  people  about  investigation  and  discovery  of  knowledge.  Many 
points  of  view  are  expressed  so  that  you  will  probably  find  yourself 
agreeing  with  some  statements  and  disagreeing  with  others,  and  you 
can  be  sure  that  many  people  feel  the  same  as  you  do.  The  best  answer, 
then,  in  each  case  is  your  personal  opinion. 

Check (✓) each statement which expresses an opinion which you
hold about investigation and discovery of knowledge. Do not mark the
other  statements. 

Sample 

✓ People who are always asking questions are a nuisance.

If  this  statement  expresses  an  opinion  which  you  hold,  you  would 
mark  it  as  shown.  If  it  does  not  express  an  opinion  which  you  hold,  you 
would  not  mark  it  at  all. 

1.  There  are  some  areas  of  knowledge  which  are  not  meant  to  be 
questioned.

2.  I  find  the  unknown  is  a  challenge  to  me. 

3.  My  curiosity  drives  me  to  seek  answers  even  when  my  questions 
are  unpopular. 

4.  New  places  and  new  objects  make  me  feel  uneasy. 

5.  I  am  quite  interested  in  the  new  things  which  science  has  given 
us,  but  sometimes  I  think  we  would  be  better  off  without  them. 

6. I think too much knowledge can sometimes be a bad thing.

7. A person can't expect to get anywhere if he doesn't use his initiative.

8.  Knowledge  is  more  important  than  being  popular. 

9.  A  person  who  has  too  much  knowledge  is  too  wrapped  up  to  enjoy 
living. 

10.  New  discoveries  are  good  so  long  as  they  help  in  improving  the 
world. 

11. If people cannot answer a question, I usually turn to other sources
of  information. 

12. I believe it is more important to get along with people than to
pursue unpopular questions.

13.  I  think  there  is  something  to  be  said  for  going  to  new  places  but  I 
am  more  at  home  with  familiar  places  and  things. 

14.  I  would  rather  listen  to  other  people's  questions  than  ask  questions 
myself. 

15.  There  can  never  be  too  much  knowledge. 

16. Curiosity is a good thing but sometimes it can be a bad thing.

17.  I  usually  read  for  information  rather  than  pleasure. 

18.  I'm  more  at  home  with  familiar  places  and  familiar  things. 

19. If my questions are unpopular I usually drop the subject.

20.  I  rarely  ask  questions,  but  I  like  to  find  out  new  and  different 
things. 

21.  Too  much  knowledge  can  drive  a  person  crazy. 

